Throughput maximization in wireless communication systems

ABSTRACT

A communication method comprising characterizing a communications channel, determining a data rate and optionally a power allocation strategy that maximizes channel throughput, and configuring a transmitter to send a transmit signal with said data rate and said optional power allocation strategy.

PRIORITY CLAIM

This application claims priority to U.S. Provisional Patent Application Ser. No. 60/152,712, filed on Oct. 20, 2003, entitled, “Throughput Maximization In A Wireless Communication System,” incorporated herein by reference.

BACKGROUND

The channels encountered by many wireless communication systems often scatter the transmitted signal along its transmission path. Time variation of the channel results in random fluctuations of the received power level, or fading, making reliable communications difficult.

Transmitters typically employ channel coding techniques that map sequences of input data to codewords that add redundancy to combat the effects of fading and noise prior to transmission. Codewords consist of a number of symbols carrying data at the transmission rate, the number of information bits communicated with each symbol. The channel coherence time is the amount of time the time-varying channel is assumed constant; signals transmitted within the coherence time are affected by a single fading state. During transmission, each codeword is affected by one or more fading states, with the specific number affecting the communications performance. The coding delay is proportional to the codeword length and is often quantified in terms of the number of fading states affecting each codeword; it significantly affects a system's reliable communications performance. A system is considered delay unconstrained if it uses infinite-length codewords, resulting in infinite coding delays. Practical communication systems are delay-limited; they use finite-length codewords and therefore have a finite coding delay.

Conventional analysis of fading channels has been performed from the single-attempt paradigm. That is, the amount of information that can be reliably communicated with a single codeword transmission attempt has been quantified. This approach works well for idealized, delay-unconstrained systems that transmit a single, infinite-length codeword. However, practical systems are delay-limited since they use finite-length codewords. Therefore, the conventional performance metrics based on the single-attempt paradigm have drawbacks for delay-limited systems: ε-capacity (the highest transmission rate that can be supported with a probability of data loss no greater than ε) does not provide a measure of error-free performance, while single-attempt delay-limited capacity (ε-capacity when data loss cannot be tolerated; that is, when ε=0) underestimates achievable performance.

SUMMARY

The problems noted above are solved in large part by a technique for throughput analysis and maximization in wireless communication systems. One illustrative embodiment may be a communication method comprising characterizing a communications channel, determining a data rate that maximizes channel throughput, and configuring a transmitter to send a transmit signal with said data rate.

Another embodiment may comprise a transceiver that comprises a receiver configured to receive information characterizing a communications channel, and a transmitter configured to process said information to determine a data rate that maximizes a throughput for the communications channel, and further configured to provide a transmit signal to the communications channel using said data rate.

Yet another embodiment may be a wireless communications system that comprises a remote transceiver configured to send information characterizing a communications channel, and a local transceiver configured to receive said information and to process said information to determine a data rate that maximizes a throughput for the communications channel, and further configured to transmit data to the remote transceiver using said data rate.

BRIEF DESCRIPTION OF THE DRAWINGS

For a detailed description of illustrative embodiments of the invention, reference will now be made to the accompanying drawings in which:

FIG. 1A shows a wireless communication system with a feedback channel;

FIG. 1B shows at least some of the internal components of a transceiver that may be used as a transmission or reception device in the system of FIG. 1A;

FIG. 1C shows a graph of ergodic capacity, with and without power control, as a function of signal-to-noise ratio (SNR);

FIG. 1D shows a graph of minimum outage probability as a function of transmitted power for a transmission rate of 1 nat/sec/Hz with constant power and under long-term power constraint;

FIG. 1E shows a queueing model of a wireless communication system;

FIG. 2A shows a graph of codeword error probability vs. rate for 1) K=∞ and 2) K=1;

FIG. 2B shows a graph of throughput vs. transmission rate when average power is approximately 10 dB for coding delays of one and infinity;

FIG. 2C shows a graph of throughput vs. transmission rate for various values of K where average power is approximately 10 dB;

FIG. 2D shows a graph of outage probability vs. transmission rate for various values of K where average power is approximately 10 dB;

FIG. 2E shows a graph of maximum zero-outage throughput (MZT) as a function of average power and coding delay vs. average power for various coding delays;

FIG. 2F shows a graph of MZT (retransmission scheme) as a function of average power and K, where average power is 0, 5, and/or 10 dB;

FIG. 2G shows a graph of transmission rate as a function of K for average power values of 0, 5 and/or 10;

FIG. 2H shows a graph of throughput vs. K achieved with an SNR of 10 dB for various transmission rates;

FIG. 2I shows a graph of throughput vs. outage probability for various values of K, where average power is 10 dB;

FIG. 2J shows a graph of optimal outage probability as a function of K for various average power values;

FIG. 2K shows a graph of fading throughput vs. transmission rate for an SNR of 10 dB using simple re-transmission and incremental diversity;

FIG. 2L shows a graph of MZT vs. SNR for various values of K;

FIG. 2M shows a graph of MZT vs. coding delay for various values of average power;

FIG. 2N shows a graph of transmission rate vs. coding delay for various values of average power;

FIG. 2O shows a graph of throughput vs. rate for various quantities of transmission attempts;

FIG. 3A shows a graph of fading state/power allocation vs. block index;

FIG. 3B shows histograms of transmitted power for a rate R of 2 nats/sec/Hz, K=5, a long-term average power constraint Pav of 10 dB and the transmission of 10000 codewords;

FIG. 3C shows histograms of transmitted power for R=2 nats/sec/Hz, K=5, Pav=10 dB, and the transmission of 10000 codewords;

FIG. 3D shows a graph of minimum outage probability vs. average power for R=1 nats/sec/Hz;

FIG. 3E shows a graph of minimum outage probability vs. R for a long-term power constraint of Pav=10 dB;

FIG. 3F shows a graph of minimum outage probability vs. R for a short-term power constraint of Pav=10 dB;

FIG. 4A shows a graph of delay-limited capacity and throughput for K=2 vs. SNR;

FIG. 4B shows a graph of spectral efficiency for K=1 as a function of SNR;

FIG. 4C shows a graph of MZT vs. K for constant, short-term and long-term power allocation strategies;

FIG. 4D shows a graph of throughput vs. transmission rate for various K and Pav=10 dB;

FIG. 4E shows a graph of optimal transmission rate vs. K for constant, short-term and long-term power allocation strategies;

FIG. 4F shows a graph of optimal outage probability vs. K for Pav=10 dB;

FIG. 4G shows a graph of MZT vs. Pav with K=5 for various values of peak power;

FIG. 4H shows a graph of MZT vs. Pav for K=5 for various values of peak power;

FIG. 4I shows a graph of MZT with a delayed transmission scheme vs. K for a long-term average power constraint Pav=10 dB;

FIG. 4J shows a graph of MZT with a delayed transmission scheme vs. K under a short-term average power constraint Pav=10 dB;

FIG. 4K shows a graph of throughput vs. transmission rate under long-term and peak power constraints for K=5, Pav=10 dB, and various peak power constraint values;

FIG. 4L shows a graph of throughput vs. transmission rate under short-term and peak power constraints for K=5, Pav=10 dB, and various values of peak power constraints;

FIG. 4M shows a graph of optimal transmission rate vs. K under a long-term average power constraint of Pav=10 dB and various peak power constraints;

FIG. 4N shows a graph of optimal transmission rate vs. K under a short-term average power constraint of Pav=10 dB and various values of peak power constraint;

FIG. 4O shows a graph of optimal outage probability vs. coding delay K under the long-term average and peak power constraints with Pav=10 dB;

FIG. 4P shows a graph of optimal outage probability vs. coding delay K under the short-term average and peak power constraints with Pav=10 dB;

FIG. 5A shows a graph of MZT and near-optimal throughput as a function of average waiting time for K=1 and Pav=10 dB;

FIG. 5B shows a graph of optimal transmission rate and near-optimal transmission rate as a function of the average waiting time for K=1 and Pav=10 dB;

FIG. 5C shows a graph of optimal arrival rate as a function of average waiting time for K=1 and Pav=10 dB;

FIG. 5D shows a graph of queue utilization for both optimal and suboptimal strategies as a function of average waiting time for K=1 and Pav=10 dB;

FIG. 5E shows a graph of MZT as a function of K for a waiting time of D=20 and for Pav=10 dB; and

FIG. 5F shows a flow diagram of a technique used to optimize throughput during data transmission over a wireless channel.

Notation and Nomenclature

Let Z, $\underline{Z}$, and $\underline{\underline{Z}}$ represent a scalar, vector, and matrix, respectively. Then diag($\underline{Z}$)=$\underline{\underline{Z}}$ is a diagonal matrix with diagonal elements $\underline{Z}$, and I^(L)=diag(1, 1, . . . , 1) is the L×L identity matrix. Let E[g(z)] represent the expected value of g(z). Let f(α) and F(α) represent the probability density function (PDF) and cumulative distribution function (CDF) of the random vector α, respectively. Let R and R₊ represent the real line and the positive real line. Then R^(L) and R^(L×M) are the set of length-L vectors and L×M matrices with elements in R, respectively. Similarly, R₊^(L) and R₊^(L×M) are the set of length-L vectors and L×M matrices with elements in R₊, respectively. For {a,b}∈R, let I_(F)(a,b) be the indicator function, which is 1 if a>b and 0 if a<b. Let w˜N(m, V) represent a jointly Gaussian random vector with mean m and covariance matrix V. Similarly, let x˜χ_(a)² with a=1, 2, 3, . . . represent a chi-squared random variable with a degrees of freedom. Finally, let W(b) be Lambert's W function, the solution to xe^(x)=b.

DETAILED DESCRIPTION

Described below is an analysis framework for delay-limited systems based on the multi-attempt paradigm. Average communications throughput is maximized by optimizing system parameters and using the maximum throughput as a measure of delay-limited communication performance. Discussed below are two common scenarios: in the first, only the receiver has channel state information (CSI-R), while in the second both transmitter and receiver have information pertaining to the channel (CSI-RT). With CSI-R, the average transmit power is held constant and throughput is maximized by performing optimal transmission rate selection. With CSI-RT, the transmitter knows the condition of the channel at the time of transmission and can vary the power accordingly. The analysis described below is performed for an average power constraint on the transmitted signal. Also considered is the scenario in which an additional peak power constraint on the transmitted signal is added. Therefore, throughput is maximized by performing optimal rate selection and power control. As a prerequisite for throughput maximization, the outage minimization problem is solved for signals with both peak and average power constraints.

Maximum ε-throughput (MεT) and maximum zero-outage throughput are shown to be measures of best-case communications performance when there is, and is not, a restriction on the maximum number of transmission attempts per codeword, respectively. A greater throughput is achieved with the multi-attempt approach than the single-attempt approach. The increased throughput comes at the cost of queueing delays that are not present when transmitters are limited to a single transmission attempt. Therefore, also discussed is the situation in which throughput is maximized with a constraint on the queueing delay.

Historically, communication systems have been examined and designed using a layered approach. The Open System Interconnection (OSI) model separates communications systems into seven layers, including the physical, data-link, network, and upper layers. The physical layer deals with the transmission of unstructured data across the physical medium, while the data-link layer is responsible for creating a reliable data pipe between transmitter and receiver. This separation works well for analyzing idealized communication systems; however, in practical systems there can be significant coupling between layers. This suggests that cross-layer optimization, rather than optimizing each layer independently, should be performed to maximize the performance of practical communication systems.

The field of information theory has concerned itself primarily with understanding the performance of the physical layer. Information theoretic measures traditionally characterize the amount of information that can be transmitted reliably with a single transmission attempt for any codeword. Single-attempt measures, for delay-limited and delay-unconstrained systems, are motivated by the fact that the upper layers will ensure reliable delivery of the data if there are errors in the physical link. For delay-unconstrained systems the communications performance is quantified by the ergodic capacity, the ultimate reliable data rate over a fading channel. The concept of outage has been introduced for delay-limited systems. If the transmission rate exceeds what the channel condition will reliably allow, then an outage occurs, resulting in a decoding error at the receiver. The outage concept leads to ε-capacity (or outage capacity) and delay-limited capacity as measures of delay-limited communication performance. ε-capacity is the highest transmission rate that can be supported with outage probability no greater than ε, while delay-limited capacity is simply ε-capacity when outages cannot be tolerated; that is, when ε=0.

Multi-Attempt Communication Paradigm

The single-attempt paradigm works well, theoretically, for delay-unconstrained systems. Such systems buffer an infinite amount of data and then transmit a single infinite-length codeword. Here, error-free communications is possible as long as the transmission rate is less than the ergodic capacity of the channel. Since error-free communications is possible, data retransmission is unnecessary, making the purely physical-layer, single-attempt approach perfectly suited for delay-unconstrained systems. For delay-limited systems, the single-attempt approach makes error-free communications very difficult. Traditional communication measures for delay-limited systems reflect this: ε-capacity does not provide a measure of error-free communications performance, while delay-limited capacity tends to underestimate communication performance.

The multi-attempt paradigm is more suitable for delay-limited systems than the single-attempt paradigm. Delay-limited systems need not restrict themselves to a single transmission attempt for each codeword; multiple transmission attempts can be performed since codewords are finite length. In practical systems upper layers will often retransmit data to ensure reliable communication. For example, variants of the link-layer ARQ or transport layer TCP protocols are often used in real-world systems. There is a disconnect between how delay-limited systems are designed and used (practical, multi-attempt) and the measures (idealized, single-attempt) used to quantify their performance. Characterizing the maximum communications throughput, when multiple transmission attempts per codeword are permitted, may lead to a more accurate reflection of communications performance of delay-limited systems than the single-attempt measures used today.

For delay-limited systems, transmitters need not restrict themselves to a single transmission attempt per codeword. In fact, practical communication protocols, such as TCP or ARQ, retransmit data when errors occur. There is a disconnect between the design of delay-limited systems (multi-attempt) and the conventional measures used to quantify their performance (single-attempt) in an effort to achieve optimal throughput. The following discussion lays out a foundation for the new analysis framework disclosed herein.

In many applications, the condition of the fading channel changes on a time scale that is much slower than the communications signalling. This motivates modeling the channel as a discrete-time, block-fading, additive white Gaussian noise (BF-AWGN) channel. In this model, each “block” of N symbols corresponds to the amount of time the channel remains constant, the channel coherence time. The system in the k^(th) block can be written
$y_{k} = h_{k} x_{k} + w_{k}, \qquad (1)$
with x_(k), y_(k)∈R^(N) representing the system input and output. A Gaussian noise process w_(k)˜N(0,I^(N)) is assumed. Scattering by the environment results in reflections of the transmitted signal that add constructively or destructively with the original signal. The multipath interference due to scattering is represented by a random multiplicative gain h_(k)∈R on the transmitted signal. Below, x, y, w and h will be used to refer to the channel input, output, noise and gain when the relative position in the codeword is not important.
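
As a rough illustration of the block-fading model in (1), the following Python sketch passes one block of N symbols through a single fading gain and adds unit-variance Gaussian noise. The helper name `bf_awgn_block`, the example symbols, and the seeded generator are illustrative assumptions and are not part of the disclosure.

```python
import random

def bf_awgn_block(x, h, rng):
    """Return y_k = h_k * x_k + w_k for one block (hypothetical helper).

    x   : list of N real channel input symbols for the block
    h   : real multiplicative gain, held constant over the block
    rng : random.Random instance used to draw unit-variance Gaussian noise
    """
    return [h * symbol + rng.gauss(0.0, 1.0) for symbol in x]

rng = random.Random(0)             # fixed seed so the sketch is repeatable
x_block = [1.0, -1.0, 1.0, -1.0]   # N = 4 symbols, chosen arbitrarily
y_block = bf_awgn_block(x_block, h=0.8, rng=rng)
```

Every symbol in the block sees the same gain, which is what distinguishes a block-fading channel from one that fades symbol by symbol.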

FIG. 1A provides a block diagram of a wireless communication system model. The model contains a transmission channel 96 used to transmit data from a transmitter 102 to a receiver 100. The model also contains a delay-less, error-free feedback link 98 used to relay acknowledgements of codewords (whether they were successfully decoded or not) back to the transmitter 102. The receiver 100 and the transmitter 102 may each be a transceiver, as shown in FIG. 1B. Specifically, a transceiver 110 may comprise an antenna 122 coupled to a hybrid 120. The hybrid 120 may convert between the bi-directional data stream 132 and the unidirectional data streams 134, 136.

The hybrid 120 may be coupled to a receive chain comprising a gain and filter 118, an analog-to-digital converter 116, a demodulator 114, and a processor 112. The processor 112 may communicate with a user or some other entity that uses the transceiver 110 to transmit or receive information. The processor 112 may be coupled to a memory 130 that may be used to store data and embedded software. The processor 112 also may be coupled to a transmit chain comprising a modulator 128, a digital-to-analog converter 126 and a driver 124 that couples to the hybrid 120. A data signal received by the antenna 122 may be directed to the receive chain by the hybrid 120. After the signal is filtered and the gain is adjusted by the gain and filter 118, the signal may be converted from analog to digital form by the converter 116 and demodulated by the demodulator 114. The processor 112 then may process the demodulated signal to extract receive signal information. Conversely, processor 112 may convert user data into a transmit data stream, which is modulated by the modulator 128 and converted to analog form by the converter 126. The signal may have its gain adjusted by the driver 124, which also drives the antenna 122. After passing through the hybrid 120, the signal may be transmitted by the antenna 122.

Codewords span K blocks of the BF-AWGN channel, contain KN symbols, and correspond to a K block coding delay. Each of the KN symbols contains information encoded at the transmission rate R nats/sec/Hz (nat:=bit/log_(e)(2)). More specifically, R denotes spectral efficiency, but also can be used to denote transmission rate and/or encoding rate. The time-variations of the channel are assumed to be independent and identically distributed (i.i.d.) from block to block. Blocks can physically correspond to slots in time, frequency, or both. The K i.i.d. channel fades affecting each codeword are
$\underline{\alpha} := [\alpha_{0}, \alpha_{1}, \ldots, \alpha_{K-1}], \qquad (2)$
with α_(k)=|h_(k)|² (or α=|h|² when the relative position in the codeword is not important). This model applies, for example, to wireless multicarrier modulated systems with K parallel subchannels.

It is assumed that the fading states follow a χ₂² (chi-squared with 2 degrees of freedom) distribution with
$f(\alpha) = e^{-\alpha} \qquad (3)$
and
$F(\alpha) = 1 - e^{-\alpha} \qquad (4)$
the PDF and CDF, respectively. Such a distribution results when the |h| are Rayleigh distributed. This model is commonly used for wireless communication systems without line-of-sight between transmitter and receiver. Constructive interference results in a large α and thus a large received signal power that is conducive to communication; this situation is a “good” fade. Destructive interference results in a small α≈0 and thus a small received signal power that is not conducive to communication; this situation is a “bad” fade.
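
The fading model in (3) and (4) can be checked numerically. The sketch below is only an illustration: it assumes the gain is built as h = a + jb with a and b independent zero-mean Gaussians of variance 1/2, which makes |h| Rayleigh distributed and α=|h|² a unit-mean variable with CDF 1−e^(−α). The function names, sample count, and seed are hypothetical.

```python
import math
import random

def sample_fading_state(rng):
    """Draw alpha = |h|^2 with h = a + jb, a and b ~ N(0, 1/2) (assumed construction)."""
    sigma = math.sqrt(0.5)
    a = rng.gauss(0.0, sigma)
    b = rng.gauss(0.0, sigma)
    return a * a + b * b

def fading_cdf(alpha):
    """CDF from (4): F(alpha) = 1 - exp(-alpha)."""
    return 1.0 - math.exp(-alpha)

rng = random.Random(0)
samples = [sample_fading_state(rng) for _ in range(100_000)]
mean_alpha = sum(samples) / len(samples)                       # should be near E[alpha] = 1
empirical_cdf_at_1 = sum(s <= 1.0 for s in samples) / len(samples)
```

Comparing `empirical_cdf_at_1` with `fading_cdf(1.0)` gives a quick sanity check that the sampled fades follow the assumed distribution.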

A system's capacity is normally measured with an average power constraint on the input, denoted P_(av). Without such a restriction the capacity of the channel may be infinite since the cardinality of the input distribution is infinite; that is, x∈R^(N). The transmitted power in the k^(th) block of a codeword is
$\gamma_{k} := \frac{1}{N} \sum_{n=0}^{N-1} |x(n)|^{2}. \qquad (5)$
Random fading results in a received power of α_(k)γ_(k). Since a unit variance noise process E[|w|²]=1 is assumed, α_(k)γ_(k) also equals the received signal-to-noise ratio (SNR) in the block. Additionally, since E[α]=1, the average received SNR is also γ. The results described herein can easily be generalized to cover non-unity variance noise processes.
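
A minimal sketch of the block power in (5) and the resulting received SNR follows. The function names and the example block are illustrative assumptions.

```python
def block_power(x):
    """gamma_k from (5): average of |x(n)|^2 over the N symbols of a block."""
    return sum(abs(symbol) ** 2 for symbol in x) / len(x)

def received_snr(alpha_k, gamma_k):
    """Received SNR alpha_k * gamma_k, assuming unit-variance noise."""
    return alpha_k * gamma_k

x_block = [2.0, -2.0, 2.0, -2.0]
gamma_k = block_power(x_block)        # 4.0
snr_k = received_snr(0.5, gamma_k)    # 2.0
```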

Two channel state information (CSI) scenarios are considered. The first is when only the receiver has perfect, delay-less and error-free, CSI (CSI-R). In this case the transmitter cannot vary the average power based on the condition of the channel since it is unknown. Therefore, performance is maximized by transmitting at the average power. That is, γ_(k)=P_(av), ∀k∈{0, 1, . . . , K−1}.

For the second scenario, when both transmitter and receiver have perfect CSI (CSI-RT), the average transmit power need not be constant; it can be varied in different blocks of the codeword based on the condition of the channel. Let γ represent a power allocation policy, a strategy that assigns the power allocation vector
$\underline{\gamma}(\underline{\alpha}) := [\gamma_{0}(\underline{\alpha}), \gamma_{1}(\underline{\alpha}), \ldots, \gamma_{K-1}(\underline{\alpha})] \qquad (6)$
given the channel α. When performing power control, the transmitter must be careful not to violate the specified power constraint. A common example is the short-term average power constraint
$\langle \underline{\gamma}(\underline{\alpha}) \rangle := \frac{1}{K} \sum_{k=0}^{K-1} \gamma_{k} \leq P_{av}. \qquad (7)$
Here the average power in any block of the codeword can exceed P_(av), while the average within the entire codeword cannot. Another widely used example is the long-term average power constraint
$E_{\underline{\alpha}}[\langle \underline{\gamma}(\underline{\alpha}) \rangle] \leq P_{av}. \qquad (8)$
This is a more relaxed condition since it allows the average power for any particular codeword to exceed P_(av) as long as the average long-term power across all codewords does not.

In practical communication systems there is often a peak power constraint on the channel input in addition to the average power constraint. Non-linearities in power amplifiers force transmitters to limit the peak power to avoid distortion of the transmitted signal. Similarly, peak power may be limited to comply with communication standards that limit the interference to other communication systems. The peak power constraint is defined as
$\gamma_{k} \leq P_{p}, \quad \forall k \in \{0, 1, \ldots, K-1\} \qquad (9)$
which limits the maximum average power that can be allocated in any block of a codeword. While not a constraint on the absolute peak, such an approach allows the constraint of the peak power of the transmitted signal while remaining in the class of capacity achieving Gaussian channel inputs. The peak-to-average power ratio (PAR) is defined as
$PAR = \frac{P_{p}}{P_{av}}. \qquad (10)$
Note that P_(p)=∞ corresponds to no peak power constraint on the channel input.

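The constraints defined above are denoted as:
$O_{K}^{st}(P_{av}) = \{\underline{\gamma} : \langle \underline{\gamma}(\underline{\alpha}) \rangle \leq P_{av}\} \qquad (11)$
$O_{K}^{lt}(P_{av}) = \{\underline{\gamma} : E_{\underline{\alpha}}[\langle \underline{\gamma}(\underline{\alpha}) \rangle] \leq P_{av}\} \qquad (12)$
$O_{K}^{st}(P_{av}, P_{p}) = \{\underline{\gamma} : \langle \underline{\gamma}(\underline{\alpha}) \rangle \leq P_{av}, \ \gamma_{k} \leq P_{p} \ \forall k = 0, 1, \ldots, K-1\} \qquad (13)$
$O_{K}^{lt}(P_{av}, P_{p}) = \{\underline{\gamma} : E_{\underline{\alpha}}[\langle \underline{\gamma}(\underline{\alpha}) \rangle] \leq P_{av}, \ \gamma_{k} \leq P_{p} \ \forall k = 0, 1, \ldots, K-1\} \qquad (14)$
or in words, as the set of all K-block power allocation policies that satisfy the short-term average (11), long-term average (12), short-term average and peak (13), and long-term average and peak (14) power constraints.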
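
The difference between the short-term, long-term, and peak constraints can be made concrete with a small membership check. The sketch below is an illustration under stated assumptions: the long-term expectation over fades is replaced by an average over a finite list of codeword allocations, and all function names and example values are hypothetical.

```python
def short_term_average(gamma):
    """<gamma> from (7): mean power over the K blocks of one codeword."""
    return sum(gamma) / len(gamma)

def satisfies_short_term(gamma, p_av, p_peak=float("inf")):
    """Membership test for (11) or, with a finite p_peak, for (13)."""
    return short_term_average(gamma) <= p_av and all(g <= p_peak for g in gamma)

def satisfies_long_term(allocations, p_av, p_peak=float("inf")):
    """Empirical stand-in for (12)/(14): the expectation over fades is
    approximated by an average over a finite list of codeword allocations."""
    mean_power = sum(short_term_average(g) for g in allocations) / len(allocations)
    peak_ok = all(g <= p_peak for alloc in allocations for g in alloc)
    return mean_power <= p_av and peak_ok

codeword_a = [12.0, 16.0]   # per-codeword average 14 exceeds P_av = 10
codeword_b = [4.0, 8.0]     # per-codeword average 6

short_a = satisfies_short_term(codeword_a, 10.0)                  # False
long_ab = satisfies_long_term([codeword_a, codeword_b], 10.0)     # True
long_ab_peak = satisfies_long_term([codeword_a, codeword_b], 10.0, p_peak=15.0)  # False
```

The first codeword violates the short-term constraint on its own, yet the pair meets the long-term constraint because the second codeword spends less than P_(av); adding a peak limit of 15 rules the pair out again.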

The instantaneous capacity (spectral efficiency), the highest reliable data rate for a codeword, is found by maximizing the mutual information over a frame of K i.i.d. channel fades, α. Assuming a Gaussian noise process and with CSI-R, and a constant average transmit power of P_(av), the instantaneous capacity is given by
$C_{K}(\underline{\alpha}, P_{av}) := \frac{1}{K} \sum_{k=0}^{K-1} \log(1 + \alpha_{k} P_{av}). \qquad (15)$
With CSI-RT and for power allocation vector γ(α), it is given by
$C_{K}^{pc}(\underline{\alpha}, \underline{\gamma}(\underline{\alpha})) := \frac{1}{K} \sum_{k=0}^{K-1} \log(1 + \alpha_{k} \gamma_{k}(\underline{\alpha})). \qquad (16)$
In both cases it is achieved using random coding at the transmitter, with the elements of x_(k) drawn from a Gaussian codebook ˜N(0, 1). Prior to transmission, each of the K blocks in the codeword is scaled by $\sqrt{P_{av}}$ (CSI-R) or $\sqrt{\gamma_{k}(\underline{\alpha})}$ (CSI-RT), respectively. Maximum a posteriori (MAP) detection is used at the receiver. Instantaneous capacity is an asymptotic quantity that is achieved as N→∞.
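
The two instantaneous capacity expressions (15) and (16) translate directly into code. The following sketch uses natural logarithms so that the result is in nats/sec/Hz; the function names and the example fading states are illustrative assumptions.

```python
import math

def instantaneous_capacity(alpha, p_av):
    """C_K from (15), in nats/sec/Hz, for constant power p_av (CSI-R)."""
    return sum(math.log(1.0 + a * p_av) for a in alpha) / len(alpha)

def instantaneous_capacity_pc(alpha, gamma):
    """C_K^pc from (16) for a per-block power allocation gamma (CSI-RT)."""
    return sum(math.log(1.0 + a * g) for a, g in zip(alpha, gamma)) / len(alpha)

alpha = [1.5, 0.2, 0.7]          # K = 3 fading states, arbitrary values
c_const = instantaneous_capacity(alpha, 10.0)
c_same = instantaneous_capacity_pc(alpha, [10.0, 10.0, 10.0])
```

When the allocation assigns P_(av) to every block, (16) reduces to (15), so `c_const` and `c_same` agree.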

Communication performance measures based on the instantaneous capacity depend on the coding delay K. The delay unconstrained (K=∞) and delay-limited (K<∞) cases are discussed below.

Delay Unconstrained Systems (K=∞)

If the sequence of fading states α_(k) for k∈{0, 1, . . . , K−1} is asymptotically ergodic as K→∞, then the channels indexed by the block length N form a family that have the same capacity. This quantity is known as ergodic capacity and with CSI-R is given by
$C_{erg} = \lim_{K \to \infty} C_{K}(\underline{\alpha}, P_{av}) = E_{\alpha}[\log(1 + \alpha P_{av})], \qquad (17)$
for an average power constraint on the channel input. The expectation is performed with respect to the distribution of the channel fading process f(α). It is found by taking K→∞ in (15).

With CSI-RT ergodic capacity is given by
$C_{erg-pc}(P_{av}) := \sup_{\gamma} E_{\alpha}[\log(1 + \alpha \gamma(\alpha))], \qquad (18)$
for an average power constraint P_(av) on the channel input. Again the expectation is performed with respect to f(α). It is found by taking K→∞ in (16) and selecting the optimal power allocation strategy that satisfies the average power constraint. The capacity achieving power allocation strategy
$\gamma^{c}(\alpha) = \left[\frac{1}{\lambda^{c}} - \frac{1}{\alpha}\right]_{+} \qquad (19)$
assigns power γ^(c)(α) to any block affected by fading state α. Here, λ^(c) is chosen such that the power constraint is satisfied,
$\int_{\lambda^{c}}^{\infty} \left(\frac{1}{\lambda^{c}} - \frac{1}{\alpha}\right) dF(\alpha) = P_{av}. \qquad (20)$
For the model in which α˜χ₂², ergodic capacity can be written
$C_{erg} = e^{1/P_{av}} \int_{1/P_{av}}^{\infty} \frac{e^{-t}}{t} \, dt \qquad (21)$
with CSI-R and as
$C_{erg-pc} = \int_{\lambda^{c}}^{\infty} \frac{e^{-t}}{t} \, dt \qquad (22)$
with CSI-RT, with λ^(c) as the solution to
$\frac{e^{-\lambda^{c}}}{\lambda^{c}} - \int_{\lambda^{c}}^{\infty} \frac{e^{-t}}{t} \, dt = P_{av}. \qquad (23)$
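
The ergodic capacities with and without power control can be evaluated numerically for the f(α)=e^(−α) fading model. The sketch below is one possible approach under stated assumptions, not the method of the disclosure: the expectation in (17) is estimated with a midpoint rule, the integral appearing in (22) and (23) is computed with a standard power series for the exponential integral (assumed adequate for arguments up to about 10), and the threshold λ^(c) in (23) is found by bisection. All function names, brackets, and step counts are hypothetical choices.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(x, terms=200):
    """Integral from x to infinity of exp(-t)/t dt, by power series.
    Assumed adequate for roughly 0 < x <= 10."""
    total = -EULER_GAMMA - math.log(x)
    term = 1.0  # holds x^k / k!
    for k in range(1, terms + 1):
        term *= x / k
        total += (-1) ** (k + 1) * term / k
    return total

def ergodic_capacity_csir(p_av, upper=40.0, steps=40_000):
    """Midpoint-rule estimate of E[log(1 + alpha * p_av)] in (17), f(alpha) = exp(-alpha)."""
    width = upper / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * width
        total += math.log(1.0 + a * p_av) * math.exp(-a)
    return total * width

def waterfilling_threshold(p_av, lo=1e-9, hi=10.0, iterations=100):
    """Solve (23) for lambda^c by bisection; the left side decreases in lambda."""
    def excess(lam):
        return math.exp(-lam) / lam - exp_integral_e1(lam) - p_av
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p_av = 10.0                             # 10 dB expressed on a linear scale
lam = waterfilling_threshold(p_av)
c_erg = ergodic_capacity_csir(p_av)     # constant power, CSI-R
c_erg_pc = exp_integral_e1(lam)         # power control, CSI-RT, from (22)
```

Because constant power allocation is itself a valid policy under the average power constraint, `c_erg_pc` should never fall below `c_erg`.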

For both CSI scenarios codewords are drawn from an infinite-length codebook with i.i.d. symbols ˜N(0, 1). Prior to transmission, the N symbols in each block are scaled by either $\sqrt{P_{av}}$ (CSI-R) or $\sqrt{\gamma^{c}(\alpha)}$ (CSI-RT). Since codewords are affected by infinitely many fading states, the effect of the fading channel can be “averaged out” and reliable transmission at ergodic capacity is possible. Contrary to what (17) and (18) seem to suggest, ergodic capacity is not actually an average capacity, but rather the highest rate that can be sustained on all channel states with arbitrarily small probability of error.

FIG. 1C compares ergodic capacity with power control (18), C_(erg-pc), to ergodic capacity for constant power allocation, C_(erg), as a function of the average power P_(av) for χ₂² fading. For small transmit powers, the capacity with power control is larger than capacity with constant power. For larger transmit powers, the difference between variable and constant power transmission shrinks, leading to the accepted wisdom that power control yields negligible capacity gains over constant power transmission. This shrinking difference occurs because the power allocated for each fading state (19) differs very little when P_(av) is large.

Delay Constrained Systems (K<∞)

For finite K<∞, the sequence of fading states α_(k) for k∈{0, 1, . . . , K−1} cannot be considered asymptotically ergodic. As such, the instantaneous capacity becomes a random quantity. When the channel condition is good, a number of the K channel fades affecting a codeword are good and a large amount of information can be transmitted per codeword. Conversely, when the channel condition is bad only a small amount of information can be reliably transmitted. An outage is declared if the transmission rate is larger than the instantaneous capacity, R>C_(K)(α, P_(av)) (CSI-R) or R>C_(K)^(pc)(α,γ(α)) (CSI-RT). For large N the outage probability closely approximates the codeword error probability.

Since the instantaneous capacity is a random quantity, outages can occur no matter how small or how large the transmission rate. The outage probability, the likelihood of outage events, is given by
$P_{out}(R, P_{av}, K) := Prob[R > C_{K}(\underline{\alpha}, P_{av})] = E_{\underline{\alpha}}[I_{F}(R, C_{K}(\underline{\alpha}, P_{av}))], \qquad (24)$
with CSI-R and by
$P_{out}(R, \gamma, K) := Prob[R > C_{K}^{pc}(\underline{\alpha}, \gamma(\alpha))] = E_{\underline{\alpha}}[I_{F}(R, C_{K}^{pc}(\underline{\alpha}, \gamma(\alpha)))], \qquad (25)$
with CSI-RT. That is, for any transmission rate R and power allocation policy (including constant power transmission) there is an associated outage probability P_(out)(R, P_(av), K) (CSI-R) or P_(out)(R, γ, K) (CSI-RT). Using this, ε-capacity is defined as
$C_{\varepsilon}(P_{av}, K) := \sup\{R : P_{out}(R, P_{av}, K) \leq \varepsilon\}, \qquad (26)$
with CSI-R and by
$C_{\varepsilon}^{pc}(P_{av}, K) := \sup\{R : P_{out}(R, \gamma, K) \leq \varepsilon, \ \gamma \in O_{K}\}, \qquad (27)$
with CSI-RT, where O_(K) is the set of all valid power allocation strategies over which the optimization is performed and can represent either O_(K)^(st)(P_(av)), O_(K)^(st)(P_(av), P_(p)), O_(K)^(lt)(P_(av)), or O_(K)^(lt)(P_(av), P_(p)). ε-capacity represents the highest rate that can be supported with outage probability no greater than ε and may be used to quantify the communications performance of delay-limited communications systems in fading channels.
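
For CSI-R, the outage probability in (24) can be estimated by simulation, and for K=1 it also has a simple closed form under the f(α)=e^(−α) fading model, since an outage occurs exactly when α<(e^(R)−1)/P_(av). The sketch below illustrates both, together with the corresponding K=1 ε-capacity from (26). The function names, trial counts, and seed are assumptions made for illustration.

```python
import math
import random

def outage_probability_k1(rate, p_av):
    """Closed form for K = 1 with f(alpha) = exp(-alpha):
    Prob[log(1 + alpha * p_av) < rate] = F((exp(rate) - 1) / p_av)."""
    return 1.0 - math.exp(-(math.exp(rate) - 1.0) / p_av)

def outage_probability_mc(rate, p_av, k, trials, rng):
    """Monte Carlo estimate of (24): fraction of codewords with R > C_K."""
    outages = 0
    for _ in range(trials):
        alpha = [rng.expovariate(1.0) for _ in range(k)]
        capacity = sum(math.log(1.0 + a * p_av) for a in alpha) / k
        if rate > capacity:
            outages += 1
    return outages / trials

def epsilon_capacity_k1(eps, p_av):
    """Largest R with outage_probability_k1(R, p_av) <= eps (K = 1, CSI-R)."""
    return math.log(1.0 - p_av * math.log(1.0 - eps))

rng = random.Random(0)
p_exact = outage_probability_k1(1.0, 10.0)
p_sim = outage_probability_mc(1.0, 10.0, k=1, trials=50_000, rng=rng)
p_sim_k4 = outage_probability_mc(1.0, 10.0, k=4, trials=50_000, rng=rng)
```

Setting `eps` to 0 in `epsilon_capacity_k1` returns a rate of zero, because arbitrarily deep fades occur with nonzero probability under this model.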

The need for a measure of error-free performance leads to the notion of delay-limited capacity
$$C_{dl}(\mathcal{P}_{av}, K) := \left. C_{\epsilon} \right|_{\epsilon = 0} \tag{28}$$
$$= \log\left(1 + \alpha_{\min} \mathcal{P}_{av}\right) \tag{29}$$
with CSI-R, and by
$$C_{dl}^{pc}(P_{av}, K) := \left. C_{\epsilon}^{pc} \right|_{\epsilon = 0} \tag{30}$$
with CSI-RT. When the minimum channel gain
$$\alpha_{\min} := \min\{\alpha\} = 0, \tag{31}$$
which is the case for many common fading distributions including χ₂ ², delay-limited capacity is 0 for all K<∞ with CSI-R. With CSI-RT, delay-limited capacity is 0 for K=1; however, it is possible to have non-zero delay-limited capacity for K>1.

Since the transmit power can be varied based on the condition of the channel with CSI-RT, the power allocation policy used affects performance. One policy of particular importance is the one that minimizes the outage probability. This policy can also be used to maximize the transmission rate for a target outage probability; that is, it can be used to achieve C_(ε)^(pc). The outage minimization problem can be stated as
$$\min\{P_{out}(R, \gamma, K) : \gamma \in O_K\}. \tag{32}$$
The solution to (32) is known as the outage minimizing power allocation strategy and has been found for O_(K)=O_(K)^(st)(P_(av)) and O_(K)=O_(K)^(lt)(P_(av)), the short-term and long-term average power constraints. The solutions for these cases are overviewed below.

Under the short-term average power constraint, O_(K)=O_(K)^(st)(P_(av)) in (32) and the outage minimizing power allocation policy is
$$\gamma_k^{st}(\underline{\alpha}) = \left[\lambda^{st}(\underline{\alpha}) - \frac{1}{\alpha_k}\right]_+, \tag{33}$$
with
$$\lambda^{st}(\underline{\alpha}) = \frac{1}{\mu(\underline{\alpha})} \sum_{l=0}^{\mu(\underline{\alpha}) - 1} \frac{1}{\alpha_{(l)}} + \frac{K}{\mu(\underline{\alpha})} \mathcal{P}_{av} \tag{34}$$
for μ(α)∈{1, 2, . . . , K} and α₍₀₎≥α₍₁₎≥ . . . ≥α_((K−1)) an ordered permutation of the fading states affecting the codeword.
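The water-filling form of (33) and (34) can be sketched in a few lines. This is an illustrative sketch rather than the disclosed implementation: it assumes strictly positive channel gains and assumes that μ(α) is chosen as the largest number of strongest blocks for which the water level in (34) still exceeds 1/α for every block that receives power, which is the usual water-filling rule. The function name is hypothetical.

```python
def short_term_allocation(alphas, p_av):
    """Return (gammas, water_level, mu) for positive channel gains `alphas`.

    gammas[k] follows (33); water_level follows (34). The rule used to pick
    mu (largest count of active blocks) is an assumption of this sketch.
    """
    k = len(alphas)
    # 1/alpha_(0) <= 1/alpha_(1) <= ... corresponds to alpha_(0) >= alpha_(1) >= ...
    inverse_gains = sorted(1.0 / a for a in alphas)
    water_level, mu = 0.0, 0
    running_sum = 0.0
    for m in range(1, k + 1):
        running_sum += inverse_gains[m - 1]
        candidate = (running_sum + k * p_av) / m  # equation (34) with mu = m
        if candidate <= inverse_gains[m - 1]:
            break  # the m-th strongest block would receive no power
        water_level, mu = candidate, m
    gammas = [max(water_level - 1.0 / a, 0.0) for a in alphas]  # equation (33)
    return gammas, water_level, mu


if __name__ == "__main__":
    gammas, level, mu = short_term_allocation([2.0, 0.5, 0.01], p_av=1.0)
    print(gammas, level, mu)
```

With this rule the allocated powers sum to K·P_(av), so the short-term constraint is met with equality and the weakest blocks are simply switched off.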

Under the long-term average power constraint, O_(K)=O_(K)^(lt)(P_(av)) in (32) and the outage minimizing power allocation policy takes the form
$$\gamma^{lt}(\underline{\alpha}) = \left\{ \begin{array}{lll} \hat{\gamma}(\underline{\alpha}), & \text{w/ prob } 1 & \text{if } \underline{\alpha} \in R_1(s_1^*) \\ \hat{\gamma}(\underline{\alpha}), & \text{w/ prob } w_1^* & \text{if } \underline{\alpha} \in \bar{R}_1(s_1^*) - R_1(s_1^*) \\ 0, & \text{w/ prob } (1 - w_1^*) & \text{if } \underline{\alpha} \in \bar{R}_1(s_1^*) - R_1(s_1^*) \\ 0, & \text{w/ prob } 1 & \text{if } \underline{\alpha} \notin R_1(s_1^*) \cup \bar{R}_1(s_1^*) \end{array} \right. \tag{35}$$
where
$$R_1(s) = \{\underline{\alpha} : \langle\hat{\gamma}(\underline{\alpha})\rangle < s\} \tag{36}$$
$$\bar{R}_1(s) = \{\underline{\alpha} : \langle\hat{\gamma}(\underline{\alpha})\rangle \le s\} \tag{37}$$
$$\bar{R}_1(s) - R_1(s) = \{\underline{\alpha} : \langle\hat{\gamma}(\underline{\alpha})\rangle = s\} \tag{38}$$
represent sets of fading states differentiated by the amount of power allocated for each fading state. Then
$$\mathcal{P}_1(s) = \int_{R_1(s)} \langle\hat{\gamma}(\underline{\alpha})\rangle \, dF(\underline{\alpha}) \tag{39}$$
$$\bar{\mathcal{P}}_1(s) = \int_{\bar{R}_1(s)} \langle\hat{\gamma}(\underline{\alpha})\rangle \, dF(\underline{\alpha}) \tag{40}$$
are the average powers allocated over these sets. Then
$$s_1^* = \sup\{s : \mathcal{P}_1(s) < \mathcal{P}_{av}\} \tag{41}$$
is the maximum average power allocated for any fading state and
$$w_1^* = \frac{\mathcal{P}_{av} - \mathcal{P}_1(s_1^*)}{\bar{\mathcal{P}}_1(s_1^*) - \mathcal{P}_1(s_1^*)} \tag{42}$$
is the probability that the codeword is transmitted when this maximum is achieved. Both s₁* and w₁* ensure the average transmitted power across all fading states is P_(av) as desired.
Finally,
$$\hat{\gamma}_k(\underline{\alpha}) = \left[\lambda^{lt}(\underline{\alpha}) - \frac{1}{\alpha_k}\right]_+ \tag{43}$$
is the form of the power allocated for fading state α, with
$$\lambda^{lt}(\underline{\alpha}) = \left(\frac{e^{KR}}{\prod_{l=0}^{\mu(\underline{\alpha}) - 1} \alpha_{(l)}}\right)^{1/\mu(\underline{\alpha})} \tag{44}$$
for μ(α)∈{1, 2, . . . , K}.

FIG. 1D plots the minimum outage probability for K=1 as a function of P_(av) under constant power allocation and under the long-term average power constraint. The gain from power control is evident: for a target outage probability, the average power required when performing power control is less than when using constant power transmission.

Throughput and Fading Channels

Within the single-attempt paradigm, zero-outage (error-free) communication is often viewed as an all-or-nothing phenomenon. For delay-unconstrained systems it is possible to transmit reliably at rates approaching ergodic capacity, while for delay-limited systems delay-limited capacity is zero for many fading distributions of interest.

A new analysis framework for delay-limited systems in fading channels is described below. FIG. 1E shows a queue 302 receiving data at a rate λ and transferring data to a server 300 using a first-in, first-out (FIFO) methodology. By modeling the communication system as a queue, it is possible to relate the throughput of the system to the amount of information passing through the queue 302. The server 300 in the queueing model encompasses the details of both the physical and data-link layers, shown in FIGS. 1A-1B. The server 300 takes codewords that arrive in the queue 302 and attempts transmission repeatedly until the channel condition allows successful transmission. The service time for a codeword is based on the number of transmission attempts required and can vary from system to system based on the particular retransmission scheme. Using this approach, the throughput is simply the transmission rate divided by the service time, that is, the amount of data in each codeword divided by the number of transmission attempts required for successful decoding. Maximizing the throughput through the queue is equivalent to maximizing the throughput of the delay-limited communication system.

As discussed above, two main CSI scenarios are considered: when only the receiver has CSI (CSI-R) and when both transmitter and receiver have CSI (CSI-RT). With CSI-R, throughput is maximized using optimal rate selection, while with CSI-RT it is maximized by optimal rate selection and power control. For both scenarios, the maximum throughput under the multi-attempt paradigm exceeds that under the single-attempt paradigm. That is, for the same coding delay a higher throughput is possible by allowing multiple transmission attempts per codeword, rather than a single attempt.

1. Multi-Attempt Throughput Maximization

1.1 Cross-Layer Queueing Model

By maximizing the communications throughput within the multi-attempt framework, the communications throughput of the physical layer (which is responsible for selecting the transmission rate R) and the data-link layer (which is responsible for data retransmission in the face of errors) are jointly maximized. This joint optimization can be used to predict the best case performance for any retransmission scheme in fading channels.

The physical and data-link layers can be modeled jointly as a queue. In this model, codewords arrive into the queue encoded at rate R, and therefore contain RKN nats. The server takes a codeword from the queue and attempts transmission. When an outage occurs, the codeword is retransmitted until successful transmission or until a maximum number of transmission attempts is reached. The number of transmission attempts for each codeword, the service time, is a random quantity due to the random nature of the fading channel. The number of transmission attempts is used to quantify the service time, since each transmission attempt corresponds to K blocks and therefore corresponds to the channel coherence time scaled by a factor of K. The service time distribution, the probability that s attempts are required for successful transmission, depends on the nature of the retransmission scheme, the transmission rate and power, and the statistics of the fading channel. In general, the probability that a codeword's service time, S, will be s attempts for successful transmission is
$$\mathrm{Prob}(S = s) = \mathrm{Prob}\left(\bigcap_{i=1}^{s-1} \mathrm{out}_i\right) \left[1 - \mathrm{Prob}\left(\mathrm{out}_s \,\middle|\, \bigcap_{i=1}^{s-1} \mathrm{out}_i\right)\right] = \mathrm{Prob}\left(\bigcap_{i=1}^{s-1} \mathrm{out}_i\right) - \mathrm{Prob}\left(\bigcap_{i=1}^{s} \mathrm{out}_i\right), \tag{1.1}$$
which is the probability of outage events on the first s−1 attempts multiplied by the probability of successful transmission on the s^(th) attempt given that all previous attempts were in outage.

The service time distribution can be used to determine the expected service time E[S] and the expected service rate 1/E[S] of a codeword. The average amount of data passing through the queue with each transmission attempt is R/E[S] (nats/sec/Hz), the encoding rate divided by the average number of attempts for successful transmission. For example, if the data transmission rate is R=10 nats/sec/Hz and it takes on average E[S]=2 transmission attempts per codeword, then the average throughput is 5 nats/sec/Hz. Using this idea, the maximum throughput of the system is defined as
$$T_{\max}(P_{av}, K, P_p) := \sup_{R} \sup_{\gamma} \left\{ \frac{R}{\mathbb{E}[S(R, \gamma)]} : \gamma \in O_K \right\}, \tag{1.2}$$
where the supremum is taken over all transmission rates and power allocation strategies in O_(K). Either constant power transmission, γ=P_(av) for a system with CSI-R, or the transmitter performing power control with O_(K)∈{O_(K)^(st)(P_(av)), O_(K)^(lt)(P_(av)), O_(K)^(st)(P_(av), P_(p)), O_(K)^(lt)(P_(av), P_(p))} for a system with CSI-RT, may be considered. It is noted that T_(max)(P_(av), K, P_(p)) predicts the best case performance for a particular multi-attempt scheme, coding delay K, average power constraint P_(av) and peak power constraint P_(p). By matching the multi-attempt scheme used in the analysis to one that is used in practice, this analysis can be used to predict the best case communication performance of practical retransmission algorithms (i.e., ARQ) in fading channels.
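The relationship between (1.1), the expected service time, and the throughput R/E[S] can be sketched as follows. This is an illustrative sketch under stated assumptions: the caller supplies a hypothetical function `joint_outage(s)` returning the probability that the first s attempts are all in outage (with `joint_outage(0)` equal to 1), and the infinite sum for E[S] is truncated at `max_attempts`, which is adequate only when the tail probability is negligible.

```python
def service_time_pmf(joint_outage, max_attempts):
    """Prob(S = s) for s = 1..max_attempts, following (1.1).

    `joint_outage(s)` is a caller-supplied (hypothetical) function giving
    Prob(out_1 and ... and out_s), with joint_outage(0) == 1.
    """
    return [joint_outage(s - 1) - joint_outage(s)
            for s in range(1, max_attempts + 1)]


def expected_service_time(joint_outage, max_attempts=1000):
    """Truncated estimate of E[S] = sum_s s * Prob(S = s)."""
    pmf = service_time_pmf(joint_outage, max_attempts)
    return sum(s * p for s, p in enumerate(pmf, start=1))


def throughput(rate, joint_outage, max_attempts=1000):
    """Average throughput R / E[S] in the units of `rate`."""
    return rate / expected_service_time(joint_outage, max_attempts)


if __name__ == "__main__":
    # Independent attempts that each fail with probability 0.5 give E[S] = 2,
    # matching the worked example of R = 10 and E[S] = 2 in the text.
    p_out = 0.5
    print(throughput(10.0, lambda s: p_out ** s))
```

Passing a different `joint_outage` function models a different retransmission scheme without changing the throughput calculation.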

If the transmission rate is R, then the amount of data successfully decoded with any transmission attempt is either 0 or R, depending on whether an outage does or does not occur, respectively. The maximum average throughput is a representative measure of communications performance.

2. Throughput Maximization with Optimal Rate Selection

When only the receiver has CSI, the transmitter does not vary the transmit power level based on the condition of the channel. As such, for this scenario, the transmitter uses the average power, γ_(k)=P_(av), ∀k∈{0, 1, . . . , K−1}. In this case the optimization in (1.2) is over the encoding rate and
$$T_{\max}(P_{av}, K) := \sup_{R} \frac{R}{\mathbb{E}[S(R, P_{av}, K)]}. \tag{2.1}$$

T_(max)(P_(av), K) represents the optimal balance between the amount of information in each codeword and the frequency at which codewords pass through the queueing system. As R→0, the amount of information carried per codeword shrinks and the throughput approaches 0. Similarly, as R→∞, outages become frequent and E[S]→∞, resulting in a throughput that approaches 0.

The optimal transmission rate depends significantly on the coding delay K. FIG. 2A(1)(2) illustrates the codeword error probability when K=∞ and K=1 (for scheme RT), respectively. The optimal operating point when K=∞ is obvious: the transmit rate is set as close to ergodic capacity as possible, with codeword error probability close to zero. However, for K=1 the optimal transmission rate is not immediately obvious. Examining the system from a throughput perspective in FIG. 2B, both systems are shown to have a transmission rate that maximizes throughput. For K=∞, R=C_(erg) is the unique throughput maximizing transmission rate. For K=1, for scheme RT, there is also a unique throughput maximizing transmission rate. For delay-limited systems, the optimal transmission rate depends on the particular retransmission scheme being used and its expected service time. In general, it is possible to specify conditions on the expected service time, for a particular retransmission scheme, that guarantee the existence of a unique throughput maximizing transmission rate.

Theorem 2.0.1. If 1/E[S(R)] is a log-concave function of R, then (1.2) has a unique global maximum.

Proof. Let T(R)=R/E[S(R)]; then f(R)=log T(R)=log R+log 1/E[S(R)]. If log 1/E[S(R)] is a concave function then f(R) is also concave, since log R is concave and the sum of two concave functions is also concave. Then from convex optimization theory, f(R) has a unique maximizer on the convex set R₊. Let R* be the argument that maximizes f(R). If f(R) is composed with the monotonically increasing function e^(x), then e^(f(R))=T(R) has the same maximizer R*. Hence (1.2) has a unique maximum. □

This is a sufficient, but not necessary, condition for the existence of a unique solution. It is possible for T(R)=R/E[S] to be log-concave without 1/E[S] being log-concave. This scenario would also have a unique maximizer for the throughput. The uniqueness of the optimal transmission rate is of practical importance. Often (1.2) cannot be solved explicitly and numerical techniques must be used. Fortunately, if 1/E[S] is log-concave, any numerical solution to (1.2) is globally optimal.

Since only the receiver has CSI, the transmitter has no way of knowing an outage has occurred unless it receives feedback from the receiver, which can be relayed to the transmitter in the form of retransmission requests. For both schemes a single bit of feedback is required for each codeword to relay (un)successful decoding acknowledgements back to the transmitter. The amount of feedback per block is 1/K bits, which approaches 0 as the coding delay increases (K→∞).

When the transmitter is allowed to retransmit each codeword as many times as necessary, zero-outage, or error-free, communication is possible. The maximum throughput is termed maximum zero-outage throughput (MZT). The name is appropriate as it quantifies the maximum error-free throughput of a communications system for a particular retransmission scheme.

2.1 Maximum zero-outage throughput with scheme RT (MZT_(RT))

2.1.1 Mathematical Formulation

For scheme RT all transmission attempts have the same probability of success or failure, and (1.1) becomes
$$\mathrm{Prob}(S = s) = \left[P_{out}(R, P_{av}, K)\right]^{s-1} \left[1 - P_{out}(R, P_{av}, K)\right]. \tag{2.2}$$
The service time distribution becomes geometric on the positive integers with parameter [1−P_(out)(R, P_(av), K)], having the well-known mean
$$\mathbb{E}[S] = \frac{1}{1 - P_{out}(R, P_{av}, K)}. \tag{2.3}$$
Using (1.2), maximum zero-outage throughput for scheme RT is defined as
$$MZT_{RT}(P_{av}, K) = \sup_{R} R\left[1 - P_{out}(R, P_{av}, K)\right]. \tag{2.4}$$
When the channel fading is good a rate R is achieved; when the channel fading is bad rate 0 is achieved due to outage. By optimizing over the transmission rate, the maximum average throughput across all channel fading states is MZT_(RT)(P_(av), K).

Note that (2.4) is simply the transmission rate R multiplied by the success probability [1−P_(out)(R, P_(av), K)] and that this same throughput can be achieved without any feedback to the transmitter. This occurs because the feedback only ensures that codewords in error are retransmitted and is not used to improve the throughput. Without such feedback, codewords in error are discarded by the receiver, and the transmitter sends a new codeword with the next transmission attempt. MZT_(RT) can also be thought of as selecting the rate and outage probability pair (R, ε) that, based on the statistics of the channel, maximizes the throughput. Typically, communications performance in fading channels is measured with ε-capacity, the highest rate for a given outage probability ε, and a small value of ε such as ε=0.01 is normally chosen. However, fixing ε may yield a low throughput. MZT_(RT) finds the (R, ε) pair that maximizes the communications throughput.
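One way to approximate (2.4) numerically, and to compare it with the throughput obtained at a fixed ε, is to work with an empirical sample of the instantaneous capacity. The sketch below is illustrative only: it assumes unit-mean χ₂ ² fading (exponential gains) and constant power, and the function names, sample size, and seed are hypothetical choices. For the empirical distribution, the throughput R[1−P_(out)(R)] is maximized at one of the sampled capacity values, so only those values are searched.

```python
import math
import random


def sample_capacities(p_av, k, trials=20000, seed=0):
    """Sorted samples of C_K = (1/K) * sum_k log(1 + alpha_k * P_av).

    Assumption of this sketch: unit-mean exponential channel gains.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        total = sum(math.log1p(rng.expovariate(1.0) * p_av) for _ in range(k))
        samples.append(total / k)
    samples.sort()
    return samples


def mzt_rt_estimate(capacities):
    """Maximize R * (1 - P_out(R)) over the empirical capacity distribution."""
    n = len(capacities)
    best_rate, best_throughput = 0.0, 0.0
    for i, rate in enumerate(capacities):
        success = (n - i) / n  # samples i..n-1 can support this rate
        if rate * success > best_throughput:
            best_rate, best_throughput = rate, rate * success
    return best_rate, best_throughput


def epsilon_capacity_throughput(capacities, eps):
    """Empirical epsilon-capacity and the throughput achieved at that rate."""
    n = len(capacities)
    i = int(eps * n)  # at most eps * n samples fall below the chosen rate
    rate = capacities[i]
    return rate, rate * (n - i) / n


if __name__ == "__main__":
    caps = sample_capacities(p_av=10.0, k=1)
    print("MZT_RT estimate:", mzt_rt_estimate(caps))
    print("eps = 0.01:", epsilon_capacity_throughput(caps, 0.01))
```

Because the ε-capacity operating point is one of the candidates searched, the estimated maximum is never below the fixed-ε throughput for the same sample.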

2.1.2 Uniqueness of MZT_(RT)

In general, (2.4) does not have a closed form due to the difficulty of obtaining exact expressions for the outage probability for common fading distributions. However, it is possible to use properties of the fading distribution to show that a unique global maximizer exists.

Theorem 2.1.1. If the probability density f_(C)(R) of the instantaneous capacity over a single block, C=log(1+αP_(av)), is log-concave, then there is a unique transmission rate that achieves MZT_(RT)(P_(av), K).

Proof. The instantaneous capacity over a codeword spanning K blocks is
$$C_K = \frac{C_{(1)} + C_{(2)} + \ldots + C_{(K)}}{K}, \tag{2.5}$$
where C_((k)) is the instantaneous capacity in the k^(th) block having distribution f_(C)(R). The outage probability P_(out)(R, P_(av), K)=Prob(C_(K)<R) is the CDF of the random variable C_(K) evaluated at R. The PDF of C_(K) is then
$$f_{C_K}(R) = \frac{f_C(R) * f_C(R) * \ldots * f_C(R)}{K}, \tag{2.6}$$
where * is convolution. Since f_(C)(R) is log-concave, (2.6) is also log-concave since the convolution of log-concave functions is also log-concave. Then both
$$P_{out}(R, P_{av}, K) = \int_{0}^{R} f_{C_K}(x) \, dx \tag{2.7}$$
and
$$1 - P_{out}(R, P_{av}, K) = \int_{R}^{\infty} f_{C_K}(x) \, dx \tag{2.8}$$
are log-concave. Since 1/E[S]=[1−P_(out)(R, P_(av), K)] is log-concave, by Theorem 2.0.1 there is a unique transmission rate corresponding to MZT_(RT)(P_(av), K). □

This result is general and holds for any fading distribution corresponding to a log-concave instantaneous capacity over a single block.

Proposition 2.1.2. If the channel fading α follows a χ₂ ² distribution, then the PDF of the instantaneous capacity f_(C)(R) is log-concave.

Proof. If the fading process follows a χ₂ ² distribution, then
$$P_{out}(R, P_{av}, 1) = \mathrm{Prob}\left[R > \log(1 + \alpha P_{av})\right] = \mathrm{Prob}\left[\frac{e^{R} - 1}{P_{av}} > \alpha\right]. \tag{2.9}$$
The CDF of a χ₂ ² random variable is F(x)=1−e^(−x) and therefore
$$F_C(R) = P_{out}(R, P_{av}, 1) = 1 - e^{-\left(\frac{e^{R} - 1}{P_{av}}\right)}. \tag{2.10}$$
Then the PDF and its derivatives are given by
$$f_C(R) = \frac{e^{R}}{P_{av}} e^{-\left(\frac{e^{R} - 1}{P_{av}}\right)}, \tag{2.11}$$
$$f_C'(R) = \frac{e^{R}}{P_{av}^{2}} e^{-\left(\frac{e^{R} - 1}{P_{av}}\right)} \left(P_{av} - e^{R}\right), \tag{2.12}$$
$$f_C''(R) = \frac{e^{R}}{P_{av}^{3}} e^{-\left(\frac{e^{R} - 1}{P_{av}}\right)} \left(P_{av}^{2} - 3 P_{av} e^{R} + e^{2R}\right). \tag{2.13}$$
Writing u=e^(R)/P_(av), (2.12) and (2.13) read f_(C)′=(1−u)f_(C) and f_(C)″=[(1−u)²−u]f_(C), so that f_(C)f_(C)″−[f_(C)′]²=−u f_(C)²≤0. It follows that
$$f_C(R) f_C''(R) \le \left[f_C'(R)\right]^{2}, \tag{2.14}$$
which is a necessary and sufficient condition for log-concavity. □

Proposition 2.1.2 implies that with χ₂ ² fading, there is a unique transmission rate that maximizes the communications throughput. However, explicit expressions for the transmission rate and the maximum throughput have been elusive. Examining (2.7), determining the outage probability involves integrating (2.6). However, a closed form expression for (2.6), let alone (2.7), may not exist since it is the convolution of one or more complicated functions. Nonetheless, when K=1, (2.4) admits a semi-explicit solution.

Theorem 2.1.3. If K=1 and the channel fading α follows a χ₂ ² distribution then
$$MZT_{RT}(P_{av}, 1) = W(P_{av}) \, e^{-\left(\frac{e^{W(P_{av})} - 1}{P_{av}}\right)}. \tag{2.15}$$

Proof. If α is a χ₂ ² random variable, then P_(out)(R, P_(av), 1)=1−e^(−(e^(R)−1)/P_(av)). Using this, let T(R)=Re^(−(e^(R)−1)/P_(av)). Taking the derivative with respect to R and equating with zero, the transmission rate corresponding to the critical point is the solution to Re^(R)=P_(av). The solution to this is the optimal transmission rate R*=W(P_(av)), where W(·) denotes the Lambert W function. This rate corresponds to a throughput maximum (rather than a minimum) from Theorem 2.1.1. Substituting this back into T(R), (2.15) is obtained. □

2.1.3 Properties of MZT_(RT)

The performance of many communication systems is often maximized with respect to a fixed outage probability. Normally, system designers select the highest transmission rate that supports a predetermined outage probability (or in practice, packet error rate). However, a greater communications throughput is possible if the constraint of a target outage probability is removed.

Theorem 2.1.4. MZT_(RT) is always greater than or equal to the throughput achieved by transmitting at ε-capacity.

Proof. For a fixed outage probability ε, ε-capacity is given by
$$C_{\epsilon} := \sup_{R} \{R : P_{out}(R, P_{av}, K) \le \epsilon\}. \tag{2.16}$$
Every transmission rate R=C_(ε) corresponds to an outage probability ε. This results in
$$T_{\epsilon} = C_{\epsilon}(1 - \epsilon) \tag{2.17}$$
as the throughput for outage probability ε. Therefore (2.17) is a single point on the curve
$$T_{RT}(R) = R\left[1 - P_{out}(R, P_{av}, K)\right], \tag{2.18}$$
with P_(out)(R, P_(av), K) the outage probability achievable for transmission rate R and coding delay K. Since
$$MZT_{RT} = \sup_{R} \{T_{RT}(R)\}, \tag{2.19}$$
then
$$MZT_{RT} \ge T_{\epsilon}. \tag{2.20}$$
□
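The semi-explicit solution of Theorem 2.1.3 can be checked numerically. The sketch below is an illustration, not part of the disclosure: it evaluates the principal branch of the Lambert W function with a plain Newton iteration (an implementation choice made here to avoid external dependencies) and then evaluates the K=1 throughput T(R)=Re^(−(e^(R)−1)/P_(av)) at R*=W(P_(av)). The function names are hypothetical, and P_(av) is taken in linear units.

```python
import math


def lambert_w(x, iterations=50):
    """Principal branch W(x) for x > 0, via Newton's method on w * e^w = x."""
    w = math.log1p(x)  # starts at or above the root, so Newton descends to it
    for _ in range(iterations):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (w + 1.0))
    return w


def throughput_k1(rate, p_av):
    """T(R) = R * (1 - P_out) for K = 1 with unit-mean exponential gains."""
    return rate * math.exp(-(math.exp(rate) - 1.0) / p_av)


def mzt_rt_k1(p_av):
    """Optimal rate R* = W(P_av) and the resulting throughput, as in (2.15)."""
    rate = lambert_w(p_av)
    return rate, throughput_k1(rate, p_av)


if __name__ == "__main__":
    rate, best = mzt_rt_k1(10.0)
    print("optimal rate:", rate, "throughput:", best)
    # Nearby rates should not exceed the throughput at the optimal rate.
    for delta in (-0.5, -0.1, 0.1, 0.5):
        print(delta, throughput_k1(rate + delta, 10.0) <= best)
```

Perturbing the rate in either direction lowers the throughput, which is consistent with the uniqueness result of Theorem 2.1.1.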

With the single-attempt approach, zero-outage communications must be guaranteed with a single transmission attempt (i.e., ε=0 in the above analysis). When 0 is in the support of the fading process, as is the case for χ₂ ² fading, T_(ε)|_(ε=0)=C_(dl)=0 is the highest single-attempt throughput. However, with the multi-attempt approach, ε is not necessarily 0, since codeword retransmission is permitted. Therefore, with 0 in the support of the fading process, it is possible to have MZT_(RT)>0. This illustrates the power of the multi-attempt approach: zero-outage communications can be possible with the multi-attempt approach when it is not possible with the single-attempt approach.

It is a known phenomenon that the outage probability approximates the codeword error probability when N is large. Since zero-outage communication is possible with K=∞, this suggests that the outage probability converges asymptotically to I_(F)(R, C_(erg)) as K→∞.

Theorem 2.1.5. The outage probability P_(out)(R, P_(av), K) converges to the indicator function I_(F)(R, C_(erg)) as K→∞.

Proof. It is possible to bound (24) using Chebyshev's inequality for R<C_(erg) by
$$0 \le P_{out}(R, P_{av}, K) \le \beta / K \tag{2.21}$$
and for R>C_(erg) by
$$1 \ge P_{out}(R, P_{av}, K) \ge 1 - \beta / K. \tag{2.22}$$
In both cases β is a constant. Taking K→∞ produces
$$\lim_{K \to \infty} P_{out}(R, P_{av}, K) = I_F(R, C_{erg}). \tag{2.23}$$
□

Intuitively, as K grows, codewords become more immune to the effects of the fading channel; blocks in the codeword that experience a good channel fade can compensate for blocks that suffer from a bad fade.
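The convergence described by Theorem 2.1.5 can be observed with a small simulation. The following sketch is illustrative and rests on stated assumptions: unit-mean χ₂ ² fading (exponential gains), constant power, and ergodic capacity estimated as the sample mean of log(1+αP_(av)). The function names, rates (80% and 120% of the estimated ergodic capacity), trial counts, and seeds are arbitrary choices made for the example.

```python
import math
import random


def block_capacity(rng, p_av, k):
    """One sample of C_K = (1/K) * sum_k log(1 + alpha_k * P_av)."""
    return sum(math.log1p(rng.expovariate(1.0) * p_av) for _ in range(k)) / k


def ergodic_capacity(p_av, trials=200000, seed=1):
    """Sample-mean estimate of E[log(1 + alpha * P_av)] in nats."""
    rng = random.Random(seed)
    return sum(block_capacity(rng, p_av, 1) for _ in range(trials)) / trials


def outage_probability(rate, p_av, k, trials=4000, seed=2):
    """Monte Carlo estimate of Prob[R > C_K] for coding delay K."""
    rng = random.Random(seed)
    return sum(rate > block_capacity(rng, p_av, k) for _ in range(trials)) / trials


if __name__ == "__main__":
    p_av = 10.0
    c_erg = ergodic_capacity(p_av)
    for k in (1, 10, 50):
        below = outage_probability(0.8 * c_erg, p_av, k)
        above = outage_probability(1.2 * c_erg, p_av, k)
        print(k, round(below, 3), round(above, 3))
```

As K increases, the estimate for the rate below ergodic capacity should move toward 0 and the estimate for the rate above it should move toward 1, in line with the indicator-function limit.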

Theorem 2.1.6. MZT_(RT)(P_(av), K) converges to ergodic capacity as K→∞.

Proof. Taking K→∞ and using (2.23) produces
$$MZT_{RT}(P_{av}, \infty) = \sup_{R} R\left[1 - I_F(R, C_{erg})\right]. \tag{2.24}$$
The maximization is trivial since, in the limit, the outage probability takes only two values: 0 or 1. MZT_(RT)(P_(av), K) therefore converges to C_(erg) as K→∞, and the optimal transmission rate is R_(MZT_RT)*=MZT_(RT)(P_(av), ∞)=C_(erg). □

When K=∞, ergodic capacity is viewed as a hard limit on the transmission rate. From the ergodic capacity theorem, if the transmission rate is less than ergodic capacity then the codeword error probability can almost always be driven to 0. On the other hand, if the transmission rate is larger than ergodic capacity, then codeword errors almost always occur. Thus only transmission rates below ergodic capacity result in non-zero throughput.

However, when K<∞ the situation is different. The outage probability approaches 1 only as R→∞, and for any finite transmission rate other than R=0 it is possible to transmit data successfully. More specifically, when K<∞ any finite transmission rate other than R=0 results in non-zero throughput. That is, the notion of capacity as a hard limit on the transmission rate is “softened.” This is due to the fact that multiple transmission attempts per codeword are permitted and there is no need to guarantee successful transmission with a single attempt. Note that using a transmission rate above ergodic capacity does not contradict any information theoretic notions since the resulting throughput is always below ergodic capacity.

Theorem 2.1.7. Non-zero throughput is achievable for transmission rates R>C_(erg) when K<∞.

Proof. Let R=C_(erg)+ε with ε>0. By Theorem 2.1.5 it is seen that P_(out)(C_(erg)+ε, P_(av), K)<1 for K<∞. Using this inequality in (2.4), it is seen that the throughput R[1−P_(out)(C_(erg)+ε, P_(av), K)]>0. □

The intuition behind this phenomenon is that for finite K<∞ the instantaneous capacity (15) is a random quantity. For fading distributions that have support on R₊, this means that no matter how high the transmission rate there is non-zero probability that the channel state is good enough to support it. Hence non-zero throughput is possible for R>C_(erg) if K<∞. In the limit when K→∞, the instantaneous capacity becomes a constant, the ergodic capacity, and it is virtually impossible for the channel to support R>C_(erg).

2.1.4 Simulation results

Now, the properties of MZT_(RT) are empirically verified via Monte Carlo simulation. For the purposes of these simulations it is assumed that the channel fading follows a χ₂ ² distribution.

Theorem 2.1.1 and Proposition 2.1.2 show that if the channel fading is χ₂ ² then a unique solution for MZT_(RT) exists. This phenomenon can be easily observed in FIG. 2C, which plots throughput vs. transmission rate for various values of K and P_(av)=10 dB. It is seen that each curve has a unique maximum corresponding to MZT_(RT)(P_(av), K). FIG. 2C also empirically verifies Theorem 2.1.7 since it is apparent that for finite K<∞ if R>C_(erg) then non-zero throughput is possible. It is also seen that as K increases the throughput achievable for R>C_(erg) decreases.

Theorem 2.1.5 shows that the outage probability as a function of R, P_(out)(R, P_(av), K), converges to I_(F)(R, C_(erg)) as K→∞. This effect can be seen in FIG. 2D, which plots outage probability vs. transmission rate for P_(av)=10 dB and for various values of K. Clearly, the larger the K, the closer the outage probability is to I_(F)(R, C_(erg)). Theorem 2.1.6 also shows that MZT_(RT)(P_(av), K) converges to C_(erg)(P_(av)) as K→∞. This is verified in FIG. 2E, which plots MZT_(RT)(P_(av), K) as a function of transmit power P_(av) for various K. As K increases it can be seen that MZT_(RT)(P_(av), K) approaches C_(erg)(P_(av)), verifying Theorem 2.1.6. FIG. 2E also demonstrates the performance penalty suffered by delay-limited systems with scheme RT when compared to ergodic capacity. For example, at a target throughput of 1 nat/sec/Hz, MZT_(RT)(P_(av), K) is about 1.18 dB away from ergodic capacity when K=100, 2.21 dB away when K=20, 2.96 dB away when K=10, and 5.54 dB away when K=1.

FIG. 2F plots MZT_(RT)(P_(av), K) as a function of coding delay K for P_(av)∈{0, 5, 10} dB. Again, this illustrates that the maximum throughput approaches C_(erg)(P_(av)) as K→∞. MZT_(RT)(P_(av), K) also appears to be a monotonically increasing function of K. This makes sense intuitively; larger coding delays result in more fading states affecting each codeword and more opportunity to “average out” poor channel conditions and therefore reach a higher throughput. Often system designers assume that for a “large enough” coding delay K the ergodic nature of the fading channel can be captured and an outage probability close to zero can be achieved. Such a scenario would result in a throughput equivalent to the transmission rate. FIG. 2H plots the throughput achieved if R=βC_(erg) for various β∈[0, 1]. It is seen for β=0.5 and β=0.7 that the throughput is close to the transmission rate when K≈10 and K≈25, respectively. For β=0.99 the throughput is below the transmission rate even for K=500. Clearly, the closer the transmission rate is to ergodic capacity, the harder it is to capture the ergodic nature of the channel.

The transmission rate of the system should be selected to achieve the MZT_(RT) rather than attempting to achieve C_(erg), which is unattainable for finite K. FIG. 2G plots the optimal transmission rate R_(MZT_RT)* as a function of K for P_(av)∈{0, 5, 10} dB. The relationship between R_(MZT_RT)* and K is not as obvious. R_(MZT_RT)* does converge to ergodic capacity as K→∞, but not monotonically, and it can fluctuate for small K. This can be attributed to the behavior of the tail of f_(C_K)(R), the distribution of the instantaneous capacity, as a function of K with χ₂ ² fading. For different fading processes the behavior of R_(MZT_RT)* may be different. This highlights the need to properly select R_(MZT_RT)* by solving (2.4).

Selecting a transmission rate that overshoots (is larger than) R_(MZT_RT)* will result in a loss in throughput when compared to MZT_(RT), as seen in FIG. 2C. The severity of the throughput loss depends on K. For larger K, the throughput vs. transmission rate curve is narrower, which shows that the throughput loss is more severe if the optimal transmission rate is overshot. In the limit when K=∞, selecting a transmission rate infinitesimally larger than the optimal one yields zero throughput. Therefore, system designers must be very careful not to overshoot the optimal transmission rate for larger K. This phenomenon also suggests a trade-off between throughput and system robustness. For larger K, MZT_(RT) is higher but the loss in throughput if the optimal transmission rate is overshot is more severe. For smaller K, MZT_(RT) is lower but the loss in throughput if the optimal rate is overshot is less severe.

Underestimating the transmission rate also yields similar losses in throughput. This is seen in FIG. 2H, which compares MZT_(RT)(P_(av), K), achieved using R_(MZT_RT)*, to the throughput achieved using R=βC_(erg), for various β∈(0, 1). For β=0.5 and β=0.7 the throughput curves plateau as a function of K and are far below MZT_(RT)(P_(av), K). This occurs because the transmission rate is underestimated and a larger rate, yielding a larger throughput, can be supported for that K. These examples clearly illustrate the importance of properly selecting the transmission rate.

The uniqueness of MZT_(RT) implies that there is a unique outage probability P_(out)(R_(MZT_RT)*, P_(av), K) that corresponds to the maximum throughput. FIG. 2I plots throughput vs. the optimal outage probability for various values of K and P_(av)=10 dB. MZT_(RT)(P_(av), K) corresponds to the peak of each curve. FIG. 2J plots the optimal outage probability as a function of K for P_(av)∈{0, 5, 10} dB. From both figures it is seen that the optimal outage probability can be high, especially for small coding delays. For example, if P_(av)=10 dB and K=1 then the optimal outage probability is P_(out)(R_(MZT_RT)*, P_(av), 1)=0.53. This suggests that in order to maximize the throughput it is necessary to lose over half of the transmitted codewords to outage. Compared to the conventional practice of constraining the outage probability to be rather small, for example ε=0.01, this is a significant departure. The penalty for an outage is zero rate for K consecutive blocks, which is small for small K and large for large K. For small K the ergodic nature of the channel cannot be captured, and the instantaneous capacity is highly variable. Throughput is maximized by exploiting this variability and transmitting codewords with a high rate and therefore a large outage probability, since the penalty for outage is small. For larger but finite K, the instantaneous capacity is still a random quantity but is not highly variable, and codewords begin to see “average” channels. Since the penalty for outage is large, throughput is optimized by selecting rates that the “average” channel can support. Intuitively, this makes sense since the optimal outage probability decreases as K→∞ and at the extreme K=∞ the optimal outage is 0, corresponding to a maximum throughput of C_(erg).

2.2 Maximum Zero-Outage Throughput with Scheme ID (MZT_(ID))

2.2.1 Mathematical Formulation

Scheme ID is a more complex retransmission scheme than scheme RT. Feedback is used not only to guarantee that codewords are successfully received but also to improve communications performance. This is accomplished by a more intelligent receiver design. Since the receiver saves, rather than discards, codewords that are in outage and optimally combines them with subsequent retransmitted versions (prior to making a decoding decision), the outage probability decreases with each retransmission.

In general, when combining J codewords with a K-block coding delay using MRRC, the instantaneous capacity is given by $\begin{matrix}{C_{K}(P_{av},K,J) = \frac{1}{K}\sum\limits_{k=0}^{K-1}\log\left(1 + \sum\limits_{j=1}^{J}\alpha_{k,j}P_{av}\right),} & (2.25)\end{matrix}$ and the associated outage probability is P_(out)(R, P_(av), K, J)=Prob[R>C_(K)(P_(av), K, J)].  (2.26) As a result, the probability of outage after s consecutive transmission attempts becomes $\begin{matrix}{{Prob}\left(\bigcap\limits_{j=1}^{s}{out}_{j}\right) = {Prob}\left[\left(R > \frac{1}{K}\sum\limits_{k=0}^{K-1}\log\left(1 + \sum\limits_{j=1}^{s}\alpha_{k,j}P_{av}\right)\right) \bigcap \left(R > \frac{1}{K}\sum\limits_{k=0}^{K-1}\log\left(1 + \sum\limits_{j=1}^{s-1}\alpha_{k,j}P_{av}\right)\right) \bigcap\right.} & (2.27) \\ {\left.\cdots \bigcap \left(R > \frac{1}{K}\sum\limits_{k=0}^{K-1}\log\left(1 + \alpha_{k,1}P_{av}\right)\right)\right]} & (2.28)\end{matrix}$ for scheme ID. In general, (2.28) is difficult to solve analytically and is computed numerically. The numerical solution to (2.28) can be used to determine the service time distribution (1.1), the expected service time E[S], and the associated maximum zero-outage throughput for scheme ID (MZT_(ID)).

2.2.2 Uniqueness of MZT_(ID) when K=1

The difficulty of determining a closed-form expression for the service time is apparent from (2.28). However, when K=1 and the channel fading is χ₂ ², (2.28) admits a closed-form expression, and therefore the service time distribution and expected service time can be determined analytically.

Theorem 2.2.1. If K=1 and the channel fading is χ₂ ² then $\begin{matrix}{{\mathbb{E}}[S] = \frac{e^{R} + P_{av} - 1}{P_{av}}} & (2.29)\end{matrix}$

Proof. For simplicity, let x=(e^(R)−1)/P_(av).
Then, since K=1, the probability of outage on successive transmission attempts (2.28) is given by $\begin{matrix}\begin{matrix}{{Prob}\left({out}_{1}\right) = {Prob}\left[x > \alpha_{1}\right],} \\ {{Prob}\left({out}_{2}\bigcap{out}_{1}\right) = {Prob}\left[\left(x > \alpha_{1} + \alpha_{2}\right)\bigcap\left(x > \alpha_{1}\right)\right],} \\ \vdots \\ {{Prob}\left(\bigcap\limits_{j=1}^{s}{out}_{j}\right) = {Prob}\left[\left(x > \sum\limits_{j=1}^{s}\alpha_{j}\right)\bigcap\left(x > \sum\limits_{j=1}^{s-1}\alpha_{j}\right)\bigcap\cdots\bigcap\left(x > \alpha_{1}\right)\right]}\end{matrix} & (2.30)\end{matrix}$ where Σ_(j=1) ^(s) α_(j) is a χ_(2s) ² random variable. This can be determined by integrating the joint distribution of the α_(j)'s over the appropriate region $\begin{matrix}{{Prob}\left(\bigcap\limits_{j=1}^{s}{out}_{j}\right) = \int_{0}^{x}\int_{0}^{x - \alpha_{1}}\cdots\int_{0}^{x - \sum\limits_{j=1}^{s-1}\alpha_{j}} f\left(\alpha_{1},\alpha_{2},\ldots,\alpha_{s}\right)\, d\alpha_{s}\, d\alpha_{s-1}\ldots d\alpha_{1}.} & (2.31)\end{matrix}$ Since the channel gains are assumed to be i.i.d., f(α₁, α₂, . . . , α_(s))=Π_(i=1) ^(s) f(α_(i)).
Then (2.31) becomes $\begin{matrix}{{Prob}\left(\bigcap\limits_{j=1}^{s}{out}_{j}\right) = 1 - e^{-x}\sum\limits_{j=0}^{s-1}\frac{x^{j}}{j!}.} & (2.32)\end{matrix}$ Using this, the service time distribution (1.1) is found to be $\begin{matrix}\begin{matrix}{{Prob}(S = s) = {Prob}\left(\bigcap\limits_{k=1}^{s-1}{out}_{k}\right) - {Prob}\left(\bigcap\limits_{k=1}^{s}{out}_{k}\right)} \\ {= e^{-x}\frac{x^{s-1}}{(s-1)!}}\end{matrix} & (2.33)\end{matrix}$ The expected service time can then be computed as $\begin{matrix}\begin{matrix}{{\mathbb{E}}[S] = \sum\limits_{s=1}^{\infty} s\left[{Prob}(S = s)\right]} \\ {= e^{-x}\sum\limits_{s=1}^{\infty}\frac{s x^{s-1}}{(s-1)!}} \\ {= e^{-x}\left(\sum\limits_{s=1}^{\infty}\frac{(s-1)x^{s-1}}{(s-1)!} + \sum\limits_{s=1}^{\infty}\frac{x^{s-1}}{(s-1)!}\right)} \\ {= e^{-x}\left(\sum\limits_{s=0}^{\infty}\frac{s x^{s}}{s!} + \sum\limits_{s=1}^{\infty}\frac{x^{s-1}}{(s-1)!}\right)} \\ {= e^{-x}\left(x\sum\limits_{s=1}^{\infty}\frac{x^{s-1}}{(s-1)!} + e^{x}\right)} \\ {= e^{-x}\left(x e^{x} + e^{x}\right)} \\ {= 1 + x.}\end{matrix} & (2.34)\end{matrix}$ By substituting the value of x, (2.29) is obtained as desired. □
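The closed form (2.29) can be checked numerically. The following is a minimal sketch, not part of the described method: it assumes the χ₂ ² fading is normalized so that each channel gain is a unit-mean exponential random variable, and the function names, trial count, and seed are illustrative choices. Each codeword is retransmitted until the accumulated gain reaches x=(e^(R)−1)/P_(av), mirroring the combining used in the proof.

```python
import math
import random


def simulate_expected_service_time(rate, p_av, trials=20000, seed=0):
    """Monte Carlo estimate of E[S] for scheme ID with K=1.

    Assumption: channel gains are i.i.d. unit-mean exponential variables.
    Decoding succeeds once the sum of the gains reaches x = (e^R - 1) / P_av.
    """
    rng = random.Random(seed)
    threshold = (math.exp(rate) - 1.0) / p_av  # x in the proof of Theorem 2.2.1
    total_attempts = 0
    for _ in range(trials):
        accumulated_gain = 0.0
        attempts = 0
        while True:
            accumulated_gain += rng.expovariate(1.0)
            attempts += 1
            if accumulated_gain >= threshold:
                break
        total_attempts += attempts
    return total_attempts / trials


def closed_form_expected_service_time(rate, p_av):
    """Equation (2.29): E[S] = (e^R + P_av - 1) / P_av."""
    return (math.exp(rate) + p_av - 1.0) / p_av


p_av = 10.0  # 10 dB expressed as a linear power ratio
rate = 2.0   # transmission rate in nats
estimate = simulate_expected_service_time(rate, p_av)
exact = closed_form_expected_service_time(rate, p_av)
print(f"simulated E[S] = {estimate:.3f}, closed form = {exact:.3f}")
```

With these settings the simulated mean should agree with the closed form to within Monte Carlo error.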

Using the form of E[S] described above, MZT_(ID)(P_(av), 1) can be written as $\begin{matrix}{{MZT}_{ID}\left(P_{av},1\right) = \sup\limits_{R}\frac{R P_{av}}{e^{R} + P_{av} - 1}.} & (2.35)\end{matrix}$ Note that this equation is quite different from (2.4). For scheme ID, the throughput is no longer the transmission rate multiplied by the success probability. The difference is due to the receiver performing MRRC with the retransmitted codewords.

As was the case for scheme RT, the special case when K=1 for scheme ID also admits a semi-explicit solution for the optimal transmission rate and therefore for MZT_(ID)(P_(av), 1).

Theorem 2.2.2. If the channel fading is χ₂ ² then (2.35) has a unique maximizer.

Proof. Let T(R)=RP_(av)/(e^(R)+P_(av)−1). Its first two derivatives are given by $\begin{matrix}{T^{\prime}(R) = \frac{\mathcal{P}_{av}\left(e^{R} + \mathcal{P}_{av} - 1 - R e^{R}\right)}{\left(e^{R} + \mathcal{P}_{av} - 1\right)^{2}},} & (2.36) \\ {T^{\prime\prime}(R) = \frac{e^{R}\mathcal{P}_{av}\left[R\left(e^{R} - \mathcal{P}_{av} + 1\right) - 2\left(e^{R} + \mathcal{P}_{av} - 1\right)\right]}{\left(e^{R} + \mathcal{P}_{av} - 1\right)^{3}}.} & (2.37)\end{matrix}$ After some algebraic manipulations it can be shown that T(R)T″(R)≦[T′(R)]²  (2.38) is satisfied, which means that T(R) is a log-concave function. Then from convex optimization theory, log T(R) has a unique maximizer R_(MZT) _(ID)* on the convex set R₊. Since e^(x) is a monotonically increasing function, T(R)=e^(log T(R)) has the same maximizer R_(MZT) _(ID)*, completing the proof. □

Theorem 2.2.3. If K=1 and the channel gains follow a χ₂ ² distribution then $\begin{matrix}{{MZT}_{ID}\left(\mathcal{P}_{av},1\right) = \frac{\left(\mathcal{W}\left(\frac{\mathcal{P}_{av} - 1}{e}\right) + 1\right)\mathcal{P}_{av}}{e^{\left(\mathcal{W}\left(\frac{\mathcal{P}_{av} - 1}{e}\right) + 1\right)} + \mathcal{P}_{av} - 1}.} & (2.39)\end{matrix}$

Proof. Let T(R)=RP_(av)/(e^(R)+P_(av)−1). Let f(R)=log[T(R)], which is concave in R since T(R) is log-concave as shown in Theorem 2.2.2. Taking the derivative of f(R) with respect to R and equating it with zero, it is seen that the transmission rate corresponding to the critical point is the solution to e^(R)(1−R)+P_(av)−1=0, which turns out to be R_(MZT) _(ID)*=W((P_(av)−1)/e)+1. It is known that this rate corresponds to a throughput maximum (rather than a minimum) from Theorem 2.2.2. Substituting this back into T(R), (2.39) is obtained. □
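The semi-explicit solution of Theorem 2.2.3 can be evaluated and cross-checked against a brute-force search over the rate. The sketch below is illustrative only: the Newton iteration for the principal branch of the Lambert W function, the restriction to P_(av)≥1 (so that the argument of W is non-negative), and the grid resolution are assumptions made for the example.

```python
import math


def lambert_w0(z, iterations=50):
    """Principal branch of the Lambert W function for z >= 0 (Newton's method)."""
    w = math.log1p(z)  # starting point; adequate for non-negative z
    for _ in range(iterations):
        ew = math.exp(w)
        w -= (w * ew - z) / (ew * (w + 1.0))
    return w


def throughput_id(rate, p_av):
    """Objective of (2.35): T(R) = R * P_av / (e^R + P_av - 1)."""
    return rate * p_av / (math.exp(rate) + p_av - 1.0)


def optimal_rate_id(p_av):
    """Closed-form maximizer R* = W((P_av - 1) / e) + 1, assuming P_av >= 1."""
    return lambert_w0((p_av - 1.0) / math.e) + 1.0


p_av = 10.0  # 10 dB as a linear ratio
r_star = optimal_rate_id(p_av)
t_star = throughput_id(r_star, p_av)

# Brute-force check on a grid of rates with spacing 0.001 nats.
grid = [i / 1000.0 for i in range(1, 8001)]
r_grid = max(grid, key=lambda r: throughput_id(r, p_av))

print(f"closed-form R* = {r_star:.4f}, grid R* = {r_grid:.3f}, MZT_ID = {t_star:.4f}")
```

At the maximizer, e^(R)+P_(av)−1=Re^(R), so the maximum throughput also equals P_(av)e^(−R*); this gives a quick consistency check on (2.39).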

Note that Theorem 2.2.2 was proved by showing directly that T(R) is a log-concave function, rather than by showing that 1/E[S] is log-concave. Indeed, 1/E[S] is not log-concave in this case, highlighting the fact that Theorem 2.1.1 is a sufficient, but not necessary, condition for uniqueness.

2.2.3 Properties of MZT_(ID)

Since it optimally combines multiple codewords to make a decoding decision, scheme ID may perform at least as well as scheme RT. Intuitively, if discarding codewords in error is optimal, then the optimal combining scheme would adopt this strategy. This can be proven explicitly for K=1.

Theorem 2.2.4. If K=1 and the channel fading follows a χ₂ ² distribution then MZT_(ID)(P_(av), 1)≧MZT_(RT)(P_(av), 1).

Proof. Let MZT_(RT)(P_(av), 1)=R₁[1−P_(out)(R₁, P_(av), 1)]=R₁e^(−(e^(R₁)−1)/P_(av)) and let MZT_(ID)(P_(av), 1)=R₂P_(av)/(e^(R₂)+P_(av)−1), where R₁=R_(MZT) _(RT)* and R₂=R_(MZT) _(ID)*. Then $\begin{matrix}\begin{matrix}{\log{MZT}_{RT}\left(\mathcal{P}_{av},1\right) = \log R_{1} - \left(\frac{e^{R_{1}} - 1}{\mathcal{P}_{av}}\right)} \\ {\leq \log R_{1} - \log\left(1 + \frac{e^{R_{1}} - 1}{\mathcal{P}_{av}}\right)} \\ {= \log R_{1} - \log\left(\frac{e^{R_{1}} - 1 + \mathcal{P}_{av}}{\mathcal{P}_{av}}\right)} \\ {\leq \log R_{2} - \log\left(\frac{e^{R_{2}} - 1 + \mathcal{P}_{av}}{\mathcal{P}_{av}}\right)} \\ {= \log{MZT}_{ID}\left(\mathcal{P}_{av},1\right).}\end{matrix} & (2.40)\end{matrix}$ The first inequality comes from the fact that x≧log(1+x) ∀x≧0, as well as R₁, R₂>0 and P_(av)>0. The second inequality occurs since R₂ is the optimizer for scheme ID. Finally, since log(x) is a monotonically increasing function of x, MZT_(RT)(P_(av), 1)≦MZT_(ID)(P_(av), 1), completing the proof. □

The gain in throughput from scheme ID over scheme RT is due to the fact that the feedback is implicitly used to optimize the transmission rate. Incremental diversity reduces the outage probability on each retransmission attempt. This allows the transmitter to select the transmission rate more aggressively, resulting in a larger throughput than with scheme RT.

2.2.4 Simulation Results

Some of the theorems and properties of MZT_(ID) are now empirically verified via Monte Carlo simulation. In all of the simulations it is assumed that the channel fading is χ₂ ².

Throughput is plotted against transmission rate in FIG. 2K with K=1 and P_(av)=10 dB for both schemes RT and ID. MZT_(ID) and MZT_(RT) correspond to the peaks of each curve. It is seen that for scheme ID there is also a single peak in the throughput vs. rate curve and a unique optimal transmission rate. This empirically validates Theorem 2.2.2. It is also seen that the throughput for scheme ID is clearly higher than that using scheme RT, verifying Theorem 2.2.4. Moreover, it is seen that the gap between scheme ID and scheme RT is larger for large R. This is due to the fact that for large R there are frequent outages and more retransmission attempts. This results in more opportunities for codeword combining, yielding greater throughput.

As a means to estimate the performance penalty for having a finite coding delay K, both MZT_(ID)(P_(av), K) and MZT_(RT)(P_(av), K) are plotted as a function of the transmitted power P_(av) for various K in FIG. 2L. For a target throughput of 1 nat/sec/Hz and a coding delay K=1, scheme ID provides a 1.22 dB gain over scheme RT. When K=10 and K=100 the gain shrinks to 0.18 dB and 0.04 dB, respectively. The decreasing difference between the two schemes can be explained as follows. For a given transmission rate, outage events are more likely when K is small. Therefore, there are more retransmissions and more opportunities for codeword combining, resulting in higher throughput for scheme ID vs. scheme RT. In the limit as K→∞ the difference will shrink to zero, as outages will never occur and codeword combining will not be exploited.
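For K=1 the power needed to reach a target throughput can be estimated from the closed-form objectives rather than by simulation. The sketch below is only an approximation under stated assumptions: unit-mean exponential channel gains, the K=1 expressions R·e^(−(e^(R)−1)/P_(av)) for scheme RT and RP_(av)/(e^(R)+P_(av)−1) for scheme ID, a grid search over the rate, and a bisection over the power in dB. The helper names and search ranges are illustrative.

```python
import math


def throughput_rt(rate, p_av):
    """Scheme RT, K=1: rate times the success probability exp(-(e^R - 1)/P_av)."""
    return rate * math.exp(-(math.exp(rate) - 1.0) / p_av)


def throughput_id(rate, p_av):
    """Scheme ID, K=1: objective of (2.35)."""
    return rate * p_av / (math.exp(rate) + p_av - 1.0)


def max_throughput(objective, p_av, r_max=10.0, steps=4000):
    """Maximize the throughput over a uniform grid of rates in (0, r_max]."""
    return max(objective(i * r_max / steps, p_av) for i in range(1, steps + 1))


def required_power_db(objective, target, lo_db=-10.0, hi_db=30.0, iterations=40):
    """Bisection on power (dB) for the smallest power whose maximum throughput
    reaches the target; valid because both objectives increase with power."""
    for _ in range(iterations):
        mid_db = 0.5 * (lo_db + hi_db)
        if max_throughput(objective, 10.0 ** (mid_db / 10.0)) < target:
            lo_db = mid_db
        else:
            hi_db = mid_db
    return 0.5 * (lo_db + hi_db)


target = 1.0  # nat/sec/Hz
power_rt_db = required_power_db(throughput_rt, target)
power_id_db = required_power_db(throughput_id, target)
gain_db = power_rt_db - power_id_db
print(f"RT needs {power_rt_db:.2f} dB, ID needs {power_id_db:.2f} dB, "
      f"gain = {gain_db:.2f} dB")
```

The printed gain can be compared with the K=1 value read from FIG. 2L; small differences are to be expected because the figure is obtained by simulation.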

FIG. 2M plots MZT_(ID)(P_(av), K) as a function of the coding delay K for various values of P_(av). The maximum throughput for scheme RT is also plotted for reference. As is the case for scheme RT, it can be seen that the maximum throughput with scheme ID can be far from ergodic capacity for finite K. Also, as K→∞, MZT_(ID)(P_(av), K)→C_(erg)(P_(av)). As with scheme RT, the convergence appears to be monotonic.

As is the case for scheme RT, the transmission rate for scheme ID should be selected carefully. Rather than trying to achieve C_(erg)(P_(av)), the transmitter should select the transmission rate to maximize the throughput. FIG. 2N plots the optimal transmission rate R_(MZT) _(ID)(P_(av), K) against the coding delay K for various values of P_(av). The optimal transmission rate for scheme RT is included for reference. It is seen that as K→∞ the optimal transmission rate converges to ergodic capacity, R_(MZT) _(ID)*(P_(av), K)→C_(erg)(P_(av)). Similar to scheme RT, the convergence is not monotonic and the optimal transmission rate can fluctuate a great deal as a function of K, especially for small K. For scheme ID the optimal transmission rate can actually be higher than the ergodic capacity of the channel. This does not contradict information-theoretic capacity theorems, as the resulting throughput is always less than ergodic capacity. However, this is rather non-intuitive, as in practice transmission rates lower than ergodic capacity are normally selected. This can be explained by the fact that the codeword combining of scheme ID reduces the outage probability with each retransmission attempt, allowing the transmitter to select the transmission rate more aggressively, in some cases resulting in rates higher than ergodic capacity.

2.3 Maximum ε Throughput with Scheme RT (MεT_(RT))

Many applications, including streaming video and voice, are sensitive to delay and jitter, the variance of the delay. These applications may not be compatible with a possibly infinite number of transmission attempts for each codeword. Limiting the number of transmission attempts provides a tighter bound on delay and jitter at the cost of not guaranteeing successful transmission of every codeword. This is illustrated by generalizing scheme RT, due to its analytical tractability, to at most L attempts.

This is denoted as RT_(L).

With scheme RT_(L), the service time distribution is $\begin{matrix}{{Prob}(S = s) = \left\{\begin{matrix}{\left[P_{out}\left(R,\mathcal{P}_{av},K\right)\right]^{s-1}\left[1 - P_{out}\left(R,\mathcal{P}_{av},K\right)\right]} & {s < L} \\ {\left[P_{out}\left(R,\mathcal{P}_{av},K\right)\right]^{s-1}\left[1 - P_{out}\left(R,\mathcal{P}_{av},K\right)\right] + \left[P_{out}\left(R,\mathcal{P}_{av},K\right)\right]^{L}} & {s = L} \\ 0 & {s > L}\end{matrix}\right.} & (2.41)\end{matrix}$ For s<L it is a geometric distribution with parameter [1−P_(out)(R, P_(av), K)]. Since L is the maximum number of transmission attempts, it is impossible for the service time to exceed L and thus the service time distribution is 0 for s>L. Finally, s=L consists of those codewords that are successfully transmitted with exactly L attempts and those that would require more attempts and are therefore in outage after L attempts. The probability of the latter event, the “effective outage” probability, can be found by summing the tail of a geometric distribution for s=L+1, . . . , ∞. That is, $\begin{matrix}\begin{matrix}{P_{out}^{eff} = \sum\limits_{s=L+1}^{\infty}\left[P_{out}\left(R,\mathcal{P}_{av},K\right)\right]^{s-1}\left[1 - P_{out}\left(R,\mathcal{P}_{av},K\right)\right]} \\ {= \left[P_{out}\left(R,\mathcal{P}_{av},K\right)\right]^{L}}\end{matrix} & (2.42)\end{matrix}$ From (2.41), the expected service time is given by $\begin{matrix}{{\mathbb{E}}[S] = \frac{1 - \left[P_{out}\left(R,\mathcal{P}_{av},K\right)\right]^{L}}{1 - P_{out}\left(R,\mathcal{P}_{av},K\right)}.} & (2.43)\end{matrix}$

The maximum ε-throughput for scheme RT_(L) (MεT_(RT) _(L)) is defined as the highest achievable throughput using at most L transmission attempts per codeword, with the effective outage probability no greater than ε; that is, $\begin{matrix}\begin{matrix}{M\varepsilon T_{{RT}_{L}}\left(\mathcal{P}_{av},K,L\right) = \sup\limits_{R}\left\{\frac{R\left[1 - P_{out}^{eff}\right]}{{\mathbb{E}}[S]} : P_{out}^{eff} \leq \varepsilon\right\}} \\ {= \sup\limits_{R}\left\{R\left[1 - P_{out}\left(R,\mathcal{P}_{av},K\right)\right] : P_{out}^{eff} \leq \varepsilon\right\}.}\end{matrix} & (2.44)\end{matrix}$ It is remarkable that MεT_(RT) _(L) is found by maximizing the same objective function as the one for MZT_(RT) in (2.4).

The only difference in finding MZT_(RT) and MεT_(RT) _(L) is that the optimization is performed over different sets of transmission rates: for MεT_(RT) _(L) this set is restricted to those rates that result in an effective outage probability no greater than a target ε. Clearly as L→∞, MεT_(RT) _(L)→MZT_(RT), since the constraint on the transmission rate disappears and P_(out) ^(eff)→0.
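The relations (2.41) through (2.44) can be exercised with a short numerical sketch. It is not a definitive implementation: it assumes K=1 with unit-mean exponential channel gains, so that P_(out)(R, P_(av), 1)=1−e^(−(e^(R)−1)/P_(av)), and it replaces the supremum in (2.44) by a search over a uniform grid of rates. The function names and the grid are illustrative.

```python
import math


def outage_probability(rate, p_av):
    """K=1 outage probability, assuming unit-mean exponential channel gains."""
    return 1.0 - math.exp(-(math.exp(rate) - 1.0) / p_av)


def service_time_pmf(p_out, max_attempts):
    """Equation (2.41): list whose entry s-1 holds Prob(S = s), s = 1..L."""
    pmf = [p_out ** (s - 1) * (1.0 - p_out) for s in range(1, max_attempts)]
    pmf.append(p_out ** (max_attempts - 1) * (1.0 - p_out) + p_out ** max_attempts)
    return pmf


def expected_service_time(p_out, max_attempts):
    """E[S] computed directly from the distribution in (2.41)."""
    pmf = service_time_pmf(p_out, max_attempts)
    return sum(s * p for s, p in enumerate(pmf, start=1))


def max_epsilon_throughput(p_av, max_attempts, epsilon, r_max=8.0, steps=8000):
    """Grid version of (2.44): maximize R(1 - P_out) subject to P_out^L <= epsilon."""
    best = 0.0
    for i in range(1, steps + 1):
        rate = i * r_max / steps
        p_out = outage_probability(rate, p_av)
        if p_out ** max_attempts <= epsilon:  # effective outage (2.42)
            best = max(best, rate * (1.0 - p_out))
    return best


p_av = 10.0     # 10 dB as a linear ratio
epsilon = 0.01  # target effective outage probability
values = {L: max_epsilon_throughput(p_av, L, epsilon) for L in (1, 2, 3, 4, 5, 50)}
for L, value in values.items():
    print(f"L = {L:2d}: max epsilon-throughput = {value:.4f}")
```

Under these assumptions the constraint stops binding once L is large enough that the unconstrained optimal rate satisfies P_(out) ^(L)≤ε, after which increasing L further leaves the result unchanged.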

The effect of a limited number of transmission attempts can be seen in FIG. 2O, which plots MεT_(RT) _(L) for various values of L and P_(out) ^(eff)=0.01. Clearly, as L increases, the set of valid transmission rates over which the optimization in (2.44) is performed grows, resulting in a throughput that approaches MZT_(RT). Once L is large enough that this set includes the transmission rate that achieves MZT_(RT), there is no throughput benefit in further increasing L. This can be seen in FIG. 2O, in which MεT_(RT) _(L) for L=5 is the same as MZT_(RT).

3. Outage Minimization Under a Peak and Average Power Constraint

Outage minimization for a fixed transmission rate is a prerequisite for maximizing the throughput of delay-limited communication systems with CSI-RT. The solution under the short-term and long-term average power constraints is described above. Fixed-rate outage minimization under both peak and average power constraints is discussed in this chapter.

3.1 Short-Term Average and Peak Power Constraints

Under the short-term average and peak power constraints, the minimum outage probability min{P_(out)(R, γ, K): γ∈O_(K) ^(st)(P_(av), P_(p))}  (3.1) is achieved by an optimal outage-minimizing power allocation strategy {tilde over (γ)}^(st)∈O_(K) ^(st)(P_(av), P_(p)).

Theorem 3.1.1. The power allocation strategy that minimizes outage under the short-term average and peak power constraints is given by $\begin{matrix}{{\overset{\sim}{\gamma}}_{k}^{st}\left(\underline{\alpha}\right) = \left\{\begin{matrix}{\min\left(\max\left({\overset{\sim}{\lambda}}^{st}\left(\underline{\alpha}\right) - \frac{1}{\alpha_{k}},0\right),\mathcal{P}_{p}\right)} & {\text{if } \mathcal{P}_{p} > \mathcal{P}_{av}} \\ \mathcal{P}_{p} & {\text{if } \mathcal{P}_{p} \leq \mathcal{P}_{av}}\end{matrix}\right.} & (3.2)\end{matrix}$ with {tilde over (λ)}^(st)(α) the solution to $\begin{matrix}{\frac{1}{K}\sum\limits_{k=0}^{K-1}\min\left(\max\left({\overset{\sim}{\lambda}}^{st}\left(\underline{\alpha}\right) - \frac{1}{\alpha_{k}},0\right),\mathcal{P}_{p}\right) = \mathcal{P}_{av}.} & (3.3)\end{matrix}$

Proof. The power allocation policy that solves (3.1) is the same as that which maximizes the K-block instantaneous capacity (15) for a codeword affected by channel α. If P_(p)≦P_(av) then (15) is trivially maximized by always transmitting at the peak power, {tilde over (γ)}_(k) ^(st)(α)=P_(p).

If P_(p)>P_(av), then (15) is maximized by solving $\begin{matrix}{\min\limits_{\underline{\gamma}}\left\{-\sum\limits_{k=0}^{K-1}\log\left(1 + \alpha_{k}\gamma_{k}\right) : 0 \leq \gamma_{k} \leq \mathcal{P}_{p},\ \sum\limits_{k=0}^{K-1}\gamma_{k} = K\mathcal{P}_{av}\right\}.} & (3.4)\end{matrix}$ The Lagrangian functional is first set up in standard form $\begin{matrix}{J = -\sum\limits_{k=0}^{K-1}\log\left(1 + \alpha_{k}\gamma_{k}\right) - \sum\limits_{k=0}^{K-1}\psi_{k}\gamma_{k} + \sum\limits_{k=0}^{K-1}\mu_{k}\left(\gamma_{k} - P_{p}\right) + \upsilon\left(\sum\limits_{k=0}^{K-1}\gamma_{k} - K P_{av}\right).} & (3.5)\end{matrix}$ Since both the objective function and the set of feasible points are convex, it is known that the Karush-Kuhn-Tucker (KKT) conditions are sufficient for optimality. Therefore any feasible point that satisfies the KKT conditions is the globally optimal point that minimizes the objective function.
The optimal power allocation policy γ* and the associated ψ_(k)*, μ_(k)* and υ* satisfy $\begin{matrix}{\gamma_{k}^{*} \geq 0} & (3.6a) \\ {\gamma_{k}^{*} \leq P_{p}} & (3.6b) \\ {\sum\limits_{k=0}^{K-1}\gamma_{k}^{*} = K P_{av}} & (3.6c) \\ {\psi_{k}^{*} \geq 0} & (3.6d) \\ {\mu_{k}^{*} \geq 0} & (3.6e) \\ {\psi_{k}^{*}\gamma_{k}^{*} = 0} & (3.6f) \\ {\mu_{k}^{*}\left(\gamma_{k}^{*} - P_{p}\right) = 0} & (3.6g) \\ {\frac{\partial J}{\partial\gamma_{k}^{*}} = \frac{-\alpha_{k}}{1 + \alpha_{k}\gamma_{k}^{*}} - \psi_{k}^{*} + \mu_{k}^{*} + \upsilon^{*} = 0} & (3.6h)\end{matrix}$ with (3.6a), (3.6b), and (3.6c) defining the set of feasible points, (3.6d) and (3.6e) the non-negativity of the Lagrange multipliers, (3.6f) and (3.6g) the complementary slackness conditions, and (3.6h) the vanishing gradient of the Lagrangian at the optimal solution. It is clear that a solution of the form $\begin{matrix}{{\overset{\sim}{\gamma}}_{k}^{st}\left(\underline{\alpha}\right) = \gamma_{k}^{*} = \min\left(\max\left({\overset{\sim}{\lambda}}^{st}\left(\underline{\alpha}\right) - \frac{1}{\alpha_{k}},0\right),P_{p}\right)} & (3.7)\end{matrix}$ with {tilde over (λ)}^(st)(α)=1/υ* the solution to $\begin{matrix}{\frac{1}{K}\sum\limits_{k=0}^{K-1}\min\left(\max\left({\overset{\sim}{\lambda}}^{st}\left(\underline{\alpha}\right) - \frac{1}{\alpha_{k}},0\right),P_{p}\right) = P_{av},} & (3.8)\end{matrix}$ satisfies (3.6a-3.6h). Thus, {tilde over (γ)}^(st) is the power allocation strategy that minimizes outage probability under a peak and short-term average power constraint. □

Note that when P_(p)>P_(av) the optimal solution has three regions. A constant power allocation of {tilde over (γ)}_(k) ^(st)(α)=P_(p) is used when {tilde over (λ)}^(st)(α)−1/α_(k)≧P_(p). Next, the waterfilling solution {tilde over (γ)}_(k) ^(st)(α)={tilde over (λ)}^(st)(α)−1/α_(k) is applied when 0<{tilde over (λ)}^(st)(α)−1/α_(k)<P_(p). Finally, no power is allocated, {tilde over (γ)}_(k) ^(st)(α)=0, when 0>{tilde over (λ)}^(st)(α)−1/α_(k).

The functional implementation of this power control strategy is straightforward. The receiver relays CSI to the transmitter. If the current channel is α, the transmitter encodes the current codeword at rate R with the power allocation vector {tilde over (γ)}^(st)(α). If the transmission rate is higher than what the channel can support, R>C_(K)(α, {tilde over (γ)}^(st)(α)), then an outage is declared.
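The allocation (3.2) and the water level defined by (3.3) can be computed with a simple bisection, since the left-hand side of (3.3) is non-decreasing in the water level. The following sketch is illustrative and rests on assumptions not stated in the text: all channel gains are strictly positive, powers are linear (not dB) quantities, and the function names, iteration count, and example channel are hypothetical.

```python
import math


def short_term_allocation(gains, p_av, p_peak, iterations=200):
    """Peak-limited waterfilling of (3.2), with the level found from (3.3).

    gains: positive channel gains alpha_k for the K blocks of one codeword.
    Returns the list of per-block powers.
    """
    num_blocks = len(gains)
    if p_peak <= p_av:
        return [p_peak] * num_blocks  # second case of (3.2)

    def allocate(level):
        return [min(max(level - 1.0 / a, 0.0), p_peak) for a in gains]

    # At the upper bracket every block sits at the peak, so the average
    # power equals p_peak > p_av; at the lower bracket it is zero.
    lo, hi = 0.0, p_peak + max(1.0 / a for a in gains)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(allocate(mid)) / num_blocks < p_av:
            lo = mid
        else:
            hi = mid
    return allocate(0.5 * (lo + hi))


def block_capacity(gains, powers):
    """K-block instantaneous capacity in nats: (1/K) * sum log(1 + alpha_k * gamma_k)."""
    return sum(math.log(1.0 + a * p) for a, p in zip(gains, powers)) / len(gains)


gains = [0.1, 0.5, 1.0, 2.0]  # hypothetical channel realization
p_av, p_peak = 10.0, 12.0     # linear power units
rate = 2.0                    # nats

powers = short_term_allocation(gains, p_av, p_peak)
capacity = block_capacity(gains, powers)
in_outage = rate > capacity
print("powers:", [round(p, 3) for p in powers])
print(f"capacity = {capacity:.4f} nats, outage = {in_outage}")
```

In this example the two strongest blocks and the second-weakest block are clipped at the peak, while the weakest block receives the remaining power so that the codeword average equals P_(av).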

3.2 Long-Term Average and Peak Power Constraints

Under the long-term average and peak power constraints, the minimum outage probability min{P_(out)(R, γ, K): γ∈O_(K) ^(lt)(P_(av), P_(p))}  (3.9) is achieved by an optimal outage-minimizing power allocation strategy {tilde over (γ)}^(lt)∈O_(K) ^(lt)(P_(av), P_(p)).

Theorem 3.2.1. The power allocation policy that minimizes outage under the long-term average and peak power constraints is $\begin{matrix}{{\overset{\sim}{\gamma}}^{lt}\left(\underline{\alpha}\right) = \left\{\begin{matrix}{\overset{\sim}{\gamma}\left(\underline{\alpha}\right),} & {\text{w/prob } 1} & {\text{if } \underline{\alpha} \notin G\left(P_{p}\right) \text{ and } \left\langle\overset{\sim}{\gamma}\left(\underline{\alpha}\right)\right\rangle < s^{*}} \\ {\overset{\sim}{\gamma}\left(\underline{\alpha}\right),} & {\text{w/prob } w^{*}} & {\text{if } \underline{\alpha} \notin G\left(P_{p}\right) \text{ and } \left\langle\overset{\sim}{\gamma}\left(\underline{\alpha}\right)\right\rangle = s^{*}} \\ {0,} & {\text{w/prob } \left(1 - w^{*}\right)} & {\text{if } \underline{\alpha} \notin G\left(P_{p}\right) \text{ and } \left\langle\overset{\sim}{\gamma}\left(\underline{\alpha}\right)\right\rangle = s^{*}} \\ {0,} & {\text{w/prob } 1} & {\text{if } \underline{\alpha} \notin G\left(P_{p}\right) \text{ and } \left\langle\overset{\sim}{\gamma}\left(\underline{\alpha}\right)\right\rangle > s^{*}} \\ {0,} & {\text{w/prob } 1} & {\text{if } \underline{\alpha} \in G\left(P_{p}\right)}\end{matrix}\right.} & (3.10)\end{matrix}$ for some subset of fading states G(P_(p))⊂R₊ ^(K), s*>0 and w*∈[0, 1], with {tilde over (γ)}_(k)(α)=min(max({tilde over (λ)}^(lt)(α)−1/α_(k), 0), P_(p))  (3.11) and {tilde over (λ)}^(lt)(α) the solution to $\begin{matrix}{\frac{1}{K}\sum\limits_{k=0}^{K-1}\log\left[1 + \alpha_{k}\min\left(\max\left({\overset{\sim}{\lambda}}^{lt}\left(\underline{\alpha}\right) - \frac{1}{\alpha_{k}},0\right),P_{p}\right)\right] = R.} & (3.12)\end{matrix}$

Proof. Suppose γ* is the outage-minimizing power control policy, γ*=arg min{P_(out)(R, γ, K): γ∈O_(K) ^(lt)(P_(av), P_(p))}.  (3.13) For this minimum-outage power allocation policy, the outage region φ(R, K)={α: C_(K)(α, γ*(α))<R}  (3.14) is the set of channels that cannot support rate R. Let {tilde over (γ)} represent the power allocation strategy that prevents outage with minimum power. That is, {tilde over (γ)}(α)=arg min{⟨γ(α)⟩: C_(K)(α, γ(α))≧R}.  (3.15) Then by definition ∫_(φ(R,K)) I_(F)[C_(K)(α, {tilde over (γ)}(α))<R]dF(α)≧∫_(φ(R,K)) I_(F)[C_(K)(α, γ*(α))<R]dF(α)  (3.16) and E_(α∉φ(R,K))[⟨{tilde over (γ)}(α)⟩]≦E_(α∉φ(R,K))[⟨γ*(α)⟩].  (3.17) Since γ* is the optimal solution to (3.13), the inequalities in (3.16) and (3.17) become equalities and hence {tilde over (γ)} is also an optimal solution.

Since outage is minimized with respect to both a peak and average power constraint, there will be a subset of the outage region for which outage cannot be prevented even if the peak power is used in each of the K blocks in the codeword. Denote this region by $\begin{matrix}{G\left(P_{p}\right) = \left\{\underline{\alpha} : \frac{1}{K}\sum\limits_{k=0}^{K-1}\log\left(1 + \alpha_{k}P_{p}\right) < R\right\}} & (3.18)\end{matrix}$ and note that it is a subset of the outage region, G(P_(p))⊂φ(R, K). Using this definition, two sets of fading states are defined, R(s)={α∉G(P_(p)): ⟨{tilde over (γ)}(α)⟩<s}  (3.19) and {overscore (R)}(s)={α∉G(P_(p)): ⟨{tilde over (γ)}(α)⟩≦s}  (3.20) that are differentiated by the average power allocated for each fading state using power allocation policy {tilde over (γ)}. The corresponding average powers over these sets are P(s)=∫_(R(s)) ⟨{tilde over (γ)}(α)⟩dF(α)  (3.21) and {overscore (P)}(s)=∫_({overscore (R)}(s)) ⟨{tilde over (γ)}(α)⟩dF(α)  (3.22) Then by Lemma 3 in the optimal power allocation policy under the peak and long-term average power constraints for all α∉G(P_(p)) is $\begin{matrix}{{\overset{\sim}{\gamma}}^{lt}\left(\underline{\alpha}\right) = \left\{\begin{matrix}{\overset{\sim}{\gamma}\left(\underline{\alpha}\right),} & {\text{w/prob } 1} & {\text{if } \left\langle\overset{\sim}{\gamma}\left(\underline{\alpha}\right)\right\rangle < s^{*}} \\ {\overset{\sim}{\gamma}\left(\underline{\alpha}\right),} & {\text{w/prob } w^{*}} & {\text{if } \left\langle\overset{\sim}{\gamma}\left(\underline{\alpha}\right)\right\rangle = s^{*}} \\ {0,} & {\text{w/prob } \left(1 - w^{*}\right)} & {\text{if } \left\langle\overset{\sim}{\gamma}\left(\underline{\alpha}\right)\right\rangle = s^{*}} \\ {0,} & {\text{w/prob } 1} & {\text{if } \left\langle\overset{\sim}{\gamma}\left(\underline{\alpha}\right)\right\rangle > s^{*}}\end{matrix}\right.} & (3.23)\end{matrix}$ where s*=sup{s: P(s)<P_(av)}  (3.24) and $\begin{matrix}{w^{*} = \frac{P_{av} - P\left(s^{*}\right)}{\overline{P}\left(s^{*}\right) - P\left(s^{*}\right)}.} & (3.25)\end{matrix}$

The form of {tilde over (γ)}, the power allocation policy that prevents outage with minimum power, can be determined by solving $\begin{matrix}{\min\limits_{\underline{\gamma}}\left\{\frac{1}{K}\sum\limits_{k=0}^{K-1}\gamma_{k} : \frac{1}{K}\sum\limits_{k=0}^{K-1}\log\left(1 + \alpha_{k}\gamma_{k}\right) = R,\ 0 \leq \gamma_{k} \leq P_{p}\right\}} & (3.26)\end{matrix}$ Setting up the functional $\begin{matrix}{L = \frac{1}{K}\sum\limits_{k=0}^{K-1}\gamma_{k} - \sum\limits_{k=0}^{K-1}\psi_{k}\gamma_{k} + \sum\limits_{k=0}^{K-1}\mu_{k}\left(\gamma_{k} - P_{p}\right) + \upsilon\left(\frac{\sum\limits_{k=0}^{K-1}\log\left(1 + \alpha_{k}\gamma_{k}\right)}{K} - R\right)} & (3.27)\end{matrix}$ and realizing that a convex objective function and a convex set of feasible points are obtained implies that the globally optimal power allocation strategy {tilde over (γ)}, and the associated {tilde over (ψ)}_(k), {tilde over (μ)}_(k) and {tilde over (υ)}, satisfy the KKT conditions $\begin{matrix}{{\overset{\sim}{\gamma}}_{k} \geq 0} & (3.28a) \\ {{\overset{\sim}{\gamma}}_{k} \leq P_{p}} & (3.28b) \\ {\frac{1}{K}\sum\limits_{k=0}^{K-1}\log\left(1 + \alpha_{k}{\overset{\sim}{\gamma}}_{k}\right) = R} & (3.28c) \\ {{\overset{\sim}{\psi}}_{k} \geq 0} & (3.28d) \\ {{\overset{\sim}{\mu}}_{k} \geq 0} & (3.28e) \\ {{\overset{\sim}{\psi}}_{k}{\overset{\sim}{\gamma}}_{k} = 0} & (3.28f) \\ {{\overset{\sim}{\mu}}_{k}\left({\overset{\sim}{\gamma}}_{k} - P_{p}\right) = 0} & (3.28g) \\ {\frac{\partial L}{\partial{\overset{\sim}{\gamma}}_{k}} = \frac{1}{K} - {\overset{\sim}{\psi}}_{k} + {\overset{\sim}{\mu}}_{k} + \frac{\overset{\sim}{\upsilon}}{K}\frac{\alpha_{k}}{1 + \alpha_{k}{\overset{\sim}{\gamma}}_{k}} = 0} & (3.28h)\end{matrix}$ with (3.28a-3.28h) corresponding to (3.6a-3.6h). A solution of the form $\begin{matrix}{{\overset{\sim}{\gamma}}_{k}\left(\underline{\alpha}\right) = \min\left(\max\left({\overset{\sim}{\lambda}}^{lt}\left(\underline{\alpha}\right) - \frac{1}{\alpha_{k}},0\right),P_{p}\right)} & (3.29)\end{matrix}$ with {tilde over (λ)}^(lt)(α)=−{tilde over (υ)} as the solution to $\begin{matrix}{\frac{1}{K}\sum\limits_{k=0}^{K-1}\log\left[1 + \alpha_{k}\min\left(\max\left({\overset{\sim}{\lambda}}^{lt}\left(\underline{\alpha}\right) - \frac{1}{\alpha_{k}},0\right),P_{p}\right)\right] = R,} & (3.30)\end{matrix}$ satisfies (3.28a-3.28h) and is therefore the power allocation policy that prevents outage with minimum power. Therefore {tilde over (γ)}^(lt) is the power allocation strategy that minimizes outage probability under a peak and long-term average power constraint. □

As with the short-term case, the optimal solution has three regions. A constant power allocation of {tilde over (γ)}_(k) ^(lt)(α)=P_(p) is used when {tilde over (λ)}^(lt)(α)−1/α_(k)≧P_(p). The waterfilling solution {tilde over (γ)}_(k) ^(lt)(α)={tilde over (λ)}^(lt)(α)−1/α_(k) is applied when 0<{tilde over (λ)}^(lt)(α)−1/α_(k)<P_(p). Finally, no power is allocated, {tilde over (γ)}_(k) ^(lt)(α)=0, when 0>{tilde over (λ)}^(lt)(α)−1/α_(k).

The implementation of this transmission scheme is relatively simple. The receiver relays the condition of the channel α back to the transmitter. For the desired transmission rate R, if α∈G(P_(p)) then an outage is immediately declared. If α∉G(P_(p)), then the transmitter encodes the codeword at rate R with power allocation {tilde over (γ)}(α), and if {tilde over (γ)}(α)>s* an outage is again declared. In the cases where {tilde over (γ)}(α)<s* and {tilde over (γ)}(α)=s*, the codeword is transmitted with probability 1 and w*, respectively.

3.3 Simulation Results

Shown in FIG. 3A is the power allocated for a particular channel α under a peak power constraint of 12 dB and short-term and long-term average power constraints of 10 dB for a transmission rate of 2 nats/sec/Hz. For this α both situations require the power allocated in several of the blocks to reach the peak power. However, under the short-term power constraint, the average power in the codeword must not exceed P_(av), and therefore several of the blocks in the codeword are allocated no power. As a result an outage is unavoidable for this α. Under the long-term power constraint, since the average power in the codeword can exceed P_(av), enough power is allocated to prevent outage.

For a fixed transmission rate the power allocated under the long-term average power constraint tends to have a larger variance than that allocated under the short-term average power constraint. This is because the average power for any particular codeword under the long-term average power constraint can exceed P_(av), while it cannot under the short-term average power constraint. The larger variance means that a larger portion of the transmitted signal would exceed the peak power constraint were it present. Therefore, when a peak power constraint is additionally imposed, the transmitted signal under the long-term average power constraint is affected to a greater degree. This can be seen in FIG. 3B and FIG. 3C, which illustrate histograms of the allocated power for 10,000 transmitted codewords both with and without a peak power constraint. Clearly, when the additional peak power constraint is imposed, the power allocation distribution changes significantly under the long-term average power constraint.

Imposing a peak power constraint limits the ability of a communications system to ensure reliable communication. That is, when a peak power constraint is imposed in addition to an average power constraint, the outage probability will be higher than if only an average power constraint is present. FIG. 3D plots the outage probability vs. P_(av) for a fixed transmission rate. It is seen that with only an average power constraint a lower outage probability is achievable for the same average power under the long-term constraint than under the short-term constraint. However, the performance difference shrinks greatly when a peak power constraint is also imposed. For a fixed PAR, under both the short-term and long-term average power scenarios the outage probability is much higher with the peak power constraint than without. However, it is seen that the short-term average power constraint is affected to a lesser degree. This occurs because the variance is higher under the long-term power scenario, making it more susceptible to the peak power constraint. For a fixed P_(p) the outage probability curve plateaus as P_(av) increases. The closer P_(av) is to P_(p), the larger the performance degradation. It is seen that the long-term average power scenario plateaus for a smaller P_(av) than the short-term average power scenario, illustrating its sensitivity to the peak power constraint.

FIG. 3E plots outage probability as a function of R for a fixed P_(av). Here again it is seen that when an additional peak power constraint is imposed, the outage probability increases. For large values of R and/or small values of P_(p) the outage probability is higher than without a peak power constraint. This is most clearly seen for the long-term power scenario with a peak power of P_(p)=16 dB. For R<3.5 nats/sec/Hz the outage probability is nearly the same as that achieved without a peak constraint, since R is relatively small and the power required for any channel state is rarely limited by the peak power constraint. For R>3.5 nats/sec/Hz, the outage probability is higher than that achieved without a peak constraint, since the power required for any channel state is often limited by the peak power constraint.

FIG. 3F plots results analogous to FIG. 3E except under the short-term average and peak power constraints. The outage probability under both peak and average power constraints is again larger than under the average power constraint alone. However, as expected, the performance loss is not as pronounced as under the long-term power constraint.

4. Throughput Maximization with Optimal Rate Selection and Power Control

The scenario in which both the transmitter and receiver have CSI is considered. When this occurs the transmitter knows prior to transmission if an outage will occur. Scheme DT is proposed, which delays transmission until the channel condition allows successful decoding at the receiver. Also, since the transmitter knows the condition of the channel, it can vary the transmit power accordingly. The average throughput is now maximized by optimally selecting the transmission rate and power control strategy.

For scheme DT the outage probabilities are independent from one transmission attempt to the next, because the channel states are assumed i.i.d. in the BF-AWGN model. As such the service time distribution, the probability that it will take s attempts for successful transmission, is Prob(S=s)=[P_(out)(R, γ, K)]^(s−1)[1−P_(out)(R, γ, K)]  (4.1) for transmission rate R, coding delay K and power allocation policy γ. This implies that the service time distribution is geometric on the positive integers with parameter [1−P_(out)(R, γ, K)]. Then $\begin{matrix}{{\mathbb{E}}\lbrack S\rbrack = \frac{1}{1 - P_{out}\left( R,\gamma,K \right)}} & (4.2)\end{matrix}$ is the expected service time.
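As a quick numerical check of (4.1) and (4.2), the short sketch below evaluates the geometric service time distribution for an assumed outage probability and confirms that its mean matches 1/(1−P_(out)). The function names and the value 0.37 are illustrative only.

```python
def service_time_pmf(s, p_out):
    """Prob(S = s) from (4.1): s - 1 outages followed by one success."""
    return p_out ** (s - 1) * (1.0 - p_out)

def expected_service_time(p_out):
    """E[S] from (4.2)."""
    return 1.0 / (1.0 - p_out)

if __name__ == "__main__":
    p_out = 0.37  # assumed outage probability, for illustration only
    mean = sum(s * service_time_pmf(s, p_out) for s in range(1, 2000))
    print(mean, expected_service_time(p_out))
```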

Using the form of the expected service time and the fact that throughput is the transmission rate divided by the expected service time, the quantity MZT_(DT)(P_(av), K, P_(p))=sup_(R) sup_(γ){R[1−P_(out)(R, γ, K)]: γ∈O_(K)}  (4.3) is defined as the maximum zero-outage throughput with scheme DT for a system with coding delay K, average transmit power P_(av) and peak transmit power P_(p). MZT_(DT)(P_(av), K) denotes the maximum throughput without a peak power constraint, that is, when P_(p)=∞.

MZT_(DT) is found by minimizing the outage probability for a given transmission rate and then taking the supremum over all transmission rates. Here the power allocation policy γ belongs to O_(K), which can represent any one of O_(K)^(st)(P_(av)), O_(K)^(lt)(P_(av)), O_(K)^(st)(P_(av), P_(p)) or O_(K)^(lt)(P_(av), P_(p)). For any transmission rate R there is an associated minimum outage probability ε that is achieved by using the appropriate outage minimizing power allocation strategy. Then, MZT_(DT) can be thought of as selecting the throughput maximizing (R, ε) pair. For each power constraint, codewords are encoded using the optimal transmission rate that is the maximizer of (4.3), and power is allocated using the appropriate outage minimizing power allocation strategy. If the transmission rate is larger than the instantaneous capacity, then an outage is declared and the transmission of the codeword is delayed.

Communications performance in fading channels has been quantified historically by ε-capacity. Typically the target outage probability is fixed to a small value such as ε=0.01. In practice it may be better from a throughput perspective not to fix the target outage probability. This is illustrated in Theorem 4.0.1.

Theorem 4.0.1. MZT_(DT) is always greater than or equal to the throughput achieved by transmitting at ε-capacity.

Proof. For a fixed outage probability ε, the ε-capacity C_(ε)^(pc):=sup_(R) sup_(γ){R: P_(out)(R, γ, K)≦ε, γ∈O_(K)}  (4.4) is found by optimally selecting R and γ with O_(K)∈{O_(K)^(st)(P_(av)), O_(K)^(lt)(P_(av)), O_(K)^(st)(P_(av), P_(p)), O_(K)^(lt)(P_(av), P_(p))}. For the outage minimizing power allocation strategy, every transmission rate R=C_(ε)^(pc) corresponds to a minimum outage probability ε. Conversely, this means every outage probability ε corresponds to a throughput maximizing transmission rate R=C_(ε)^(pc). Transmitting at R=C_(ε)^(pc) results in a throughput T_(ε)=C_(ε)^(pc)(1−ε).  (4.5) Therefore (4.5) is a single point on the curve T_(DT)(R)=R[1−P_(out)(R, γ*, K)]  (4.6) with P_(out)(R, γ*, K) the minimum outage probability that is achievable for transmission rate R and coding delay K. Since MZT_(DT)=sup_(R){T_(DT)(R)},  (4.7) the relation MZT_(DT)≧T_(ε)  (4.8) is obtained, completing the proof. □

Corollary 4.0.2. MZT_(DT) is always greater than or equal to the throughput achieved by transmitting at delay-limited capacity.

Proof. This is trivially shown by setting ε=0 and applying Theorem 4.0.1. □

Corollary 4.0.2 illustrates the power of the multi-attempt approach for delay-limited systems. For the same coding delay K, a throughput at least as high is achieved by allowing multiple, rather than a single, transmission attempts per codeword. That is, MZT_(DT) is at least as large as delay-limited capacity (T_(ε)|_(ε=0)). The cost of the improved throughput is a queueing delay that is not present if the system is restricted to a single transmission attempt per codeword.

4.1 Maximum Zero-Outage Throughput with Scheme DT (MZT_(DT)) Under Different Power Constraints

MZT_(DT) is now examined under the short-term average and long-term average power constraints both with and without an additional peak power constraint. For each power constraint either O_(K)^(st)(P_(av)), O_(K)^(lt)(P_(av)), O_(K)^(st)(P_(av), P_(p)) or O_(K)^(lt)(P_(av), P_(p)) is substituted for O_(K) in (4.3). Then, using the form of the outage minimizing power allocation strategy, (4.3) can be reduced to an optimization problem of only a single variable, the transmission rate R.

Theorem 4.1.1. The maximum zero-outage throughput with the delayed transmission scheme under the short-term average power constraint is $\begin{matrix}{{MZT}_{DT}^{st}\left( P_{av},K \right) = \sup\limits_{R}\left( R\,{\mathbb{E}}_{\underline{\alpha}}\left\{ I_{F}\left\lbrack R \leq \frac{1}{K}\sum\limits_{k = 0}^{K - 1}\log\left( 1 + \alpha_{k}\max\left\{ \lambda^{st}\left( \underline{\alpha} \right) - \frac{1}{\alpha_{k}},0 \right\} \right) \right\rbrack \right\} \right)} & (4.9)\end{matrix}$ with λ^(st)(α) as the solution to $\begin{matrix}{\sum\limits_{k = 0}^{K - 1}\max\left( \lambda^{st}\left( \underline{\alpha} \right) - \frac{1}{\alpha_{k}},0 \right) = K\,P_{av}.} & (4.10)\end{matrix}$

Theorem 4.1.2. The maximum zero-outage throughput with the delayed transmission scheme under both the short-term average and peak power constraints is $\begin{matrix}{{MZT}_{DT}^{st}\left( P_{av},K,P_{p} \right) = \left\{ \begin{matrix}{\sup\limits_{R}\left( R\,{\mathbb{E}}_{\underline{\alpha}}\left\{ I_{F}\left\lbrack R \leq \frac{1}{K}\sum\limits_{k = 0}^{K - 1}\log\left( 1 + \alpha_{k}\xi_{k} \right) \right\rbrack \right\} \right)} & {\text{if }P_{p} > P_{av}} \\ {\sup\limits_{R}\left( R\,{\mathbb{E}}_{\underline{\alpha}}\left\{ I_{F}\left\lbrack R \leq \frac{1}{K}\sum\limits_{k = 0}^{K - 1}\log\left( 1 + \alpha_{k}P_{p} \right) \right\rbrack \right\} \right)} & {\text{if }P_{p} \leq P_{av}}\end{matrix} \right.} & (4.11)\end{matrix}$ with ξ_(k)=min{max[{tilde over (λ)}^(st)(α)−1/α_(k), 0], P_(p)} and {tilde over (λ)}^(st)(α) as the solution to $\begin{matrix}{\sum\limits_{k = 0}^{K - 1}\min\left( \max\left( \tilde{\lambda}^{st}\left( \underline{\alpha} \right) - \frac{1}{\alpha_{k}},0 \right),P_{p} \right) = K\,P_{av}.} & (4.12)\end{matrix}$

Theorem 4.1.3. The maximum zero-outage throughput with the delayed transmission scheme under the long-term average power constraint is $\begin{matrix}{{MZT}_{DT}^{lt}\left( P_{av},K \right) = \sup\limits_{R}\left\{ R\left\lbrack {\mathbb{E}}_{\underline{\alpha}}\left\{ I_{F}\left( \frac{1}{K}\sum\limits_{k = 0}^{K - 1}\max\left( \lambda^{lt}\left( \underline{\alpha} \right) - \frac{1}{\alpha_{k}},0 \right) < s^{*} \right) \right\} + w^{*}\,{\mathbb{E}}_{\underline{\alpha}}\left\{ I_{F}\left( \frac{1}{K}\sum\limits_{k = 0}^{K - 1}\max\left( \lambda^{lt}\left( \underline{\alpha} \right) - \frac{1}{\alpha_{k}},0 \right) = s^{*} \right) \right\} \right\rbrack \right\}} & (4.13)\end{matrix}$ with λ^(lt)(α) as the solution to $\begin{matrix}{\frac{1}{K}\sum\limits_{k = 0}^{K - 1}\log\left\lbrack 1 + \alpha_{k}\max\left( \lambda^{lt}\left( \underline{\alpha} \right) - \frac{1}{\alpha_{k}},0 \right) \right\rbrack = R.} & (4.14)\end{matrix}$

Theorem 4.1.4. The maximum zero-outage throughput with the delayed transmission scheme under both the long-term average and peak power constraints is MZT_(DT)^(lt)(P_(av), K, P_(p))=sup_(R){R[E_(α){I_(F)(α∉G_(p))}][E_(α){I_(F)(κ<s*)}+w*E_(α){I_(F)(κ=s*)}]}  (4.15) with κ=(1/K)Σ_(k=0)^(K−1) min[max({tilde over (λ)}^(lt)(α)−1/α_(k), 0), P_(p)] and {tilde over (λ)}^(lt)(α) as the solution to $\begin{matrix}{\frac{1}{K}\sum\limits_{k = 0}^{K - 1}\log\left\lbrack 1 + \alpha_{k}\min\left( \max\left( \tilde{\lambda}^{lt}\left( \underline{\alpha} \right) - \frac{1}{\alpha_{k}},0 \right),P_{p} \right) \right\rbrack = R.} & (4.16)\end{matrix}$

For each power constraint, codewords are encoded using the optimal transmission rates that are the optimizers in Theorems 4.1.1, 4.1.2, 4.1.3 and 4.1.4, respectively. With the appropriate outage minimizing power allocation strategy in use, if the transmission rate is larger than the instantaneous capacity then an outage is declared and transmission of the codeword is delayed until a more favorable channel state arises.
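The single-variable optimization in Theorem 4.1.1 can be approximated by simulation. The sketch below is illustrative only and rests on stated assumptions: the block gains are drawn i.i.d. from a unit-mean exponential distribution (a normalized χ₂ ² fading process), the per-codeword capacity is the average of log(1+α_(k)γ_(k)) over the K blocks as in (4.14), and the expectation is replaced by a Monte Carlo average. The helper names are hypothetical.

```python
import math
import random

def waterfill_short_term(alpha, p_av):
    """Solve (4.10) for the water level and return the per-block powers."""
    K = len(alpha)
    inv = sorted(1.0 / a for a in alpha)
    # Try n active blocks, from all K down to the single best block.
    for n in range(K, 0, -1):
        lam = (K * p_av + sum(inv[:n])) / n
        if lam > inv[n - 1]:
            break
    return [max(lam - 1.0 / a, 0.0) for a in alpha]

def block_capacity(alpha, powers):
    """Average of log(1 + alpha_k * gamma_k) over the K blocks, in nats."""
    return sum(math.log(1.0 + a * g) for a, g in zip(alpha, powers)) / len(alpha)

def estimate_mzt_st(p_av, K, trials=20000, seed=0):
    """Monte Carlo estimate of (rate, throughput) maximizing R * Prob(R <= capacity)."""
    rng = random.Random(seed)
    caps = []
    for _ in range(trials):
        alpha = [max(rng.expovariate(1.0), 1e-12) for _ in range(K)]
        caps.append(block_capacity(alpha, waterfill_short_term(alpha, p_av)))
    caps.sort()
    n = len(caps)
    best_rate, best_tp = 0.0, 0.0
    for i, c in enumerate(caps):
        # Transmitting at R = c succeeds on the n - i draws with capacity >= c.
        tp = c * (n - i) / n
        if tp > best_tp:
            best_rate, best_tp = c, tp
    return best_rate, best_tp

if __name__ == "__main__":
    for K in (1, 2, 5):
        print(K, estimate_mzt_st(10.0, K))
```

Only sampled capacities are tried as candidate rates, because the empirical throughput R·Prob(R≦capacity) can only peak at one of those values.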

4.2 Special cases of MZT_(DT)

Since the outage minimizing power allocation policies are complicated functions of the channel state α, the expressions for MZT_(DT) are even more complex. However, for K=1 and a χ₂ ² fading process, more explicit expressions for three of the four power allocation scenarios have been found.

For the short-term average power constraint it is possible to find the optimal transmission rate and the maximum throughput.

Theorem 4.2.1. If K=1 and the fading process α follows a χ₂ ² distribution, then $\begin{matrix}{{MZT}_{DT}^{st}\left( P_{av},1 \right) = W\left( P_{av} \right)e^{- \left( \frac{e^{W{(P_{av})}} - 1}{P_{av}} \right)}} & (4.17)\end{matrix}$

Proof. Since K=1 and the entire codeword spans a single block of the BF-AWGN channel, the outage minimizing power allocation is to use all the power P_(av) within the codeword. In this case the solution is the same as constant power allocation when K=1.

If α follows a χ₂ ² distribution, then 1−P_(out)(R, P_(av), 1)=e^(−(e^(R)−1)/P_(av)). Using this, T(R)=Re^(−(e^(R)−1)/P_(av)). Taking the derivative with respect to R and equating it with zero, it is seen that the transmission rate corresponding to the critical point is the solution to Re^(R)=P_(av). The solution to this is the optimal transmission rate R*=W(P_(av)), where W(·) is the Lambert W function. Substituting this back into T(R), (4.17) is obtained. From Theorem 2.1.1 and Proposition 2.1.2 it is known that this solution corresponds to a unique maximum. □
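The closed form in (4.17) is easy to evaluate numerically. The following sketch assumes the K=1 success probability e^(−(e^(R)−1)/P_(av)) given above, with P_(av) in linear units, and computes the principal branch of the Lambert W function by Newton's method so that no external library is needed. The function names are illustrative.

```python
import math

def lambert_w(x, iters=50):
    """Principal branch W(x) for x > 0: solves w * exp(w) = x by Newton's method."""
    w = math.log(1.0 + x)  # starting guess at or above the root for x > 0
    for _ in range(iters):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (w + 1.0))
    return w

def throughput_k1(R, p_av):
    """T(R) = R * exp(-(exp(R) - 1) / p_av) for K = 1."""
    return R * math.exp(-(math.exp(R) - 1.0) / p_av)

def mzt_st_k1(p_av):
    """Optimal rate R* = W(p_av) and the throughput of (4.17)."""
    r_opt = lambert_w(p_av)
    return r_opt, throughput_k1(r_opt, p_av)

if __name__ == "__main__":
    p_av = 10.0  # 10 dB in linear units
    print(mzt_st_k1(p_av))
```

Under the peak power constraint of Theorem 4.2.2, the same routine would be called with min(P_(av), P_(p)) in place of P_(av).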

When a peak power constraint is imposed in addition to the short-term average power constraint, a similar result is obtained.

Theorem 4.2.2. If K=1 and the channel fading α follows a χ₂ ² distribution, then $\begin{matrix}{{MZT}_{DT}^{st}\left( P_{av},1,P_{p} \right) = W(\gamma)e^{- \left( \frac{e^{W{(\gamma)}} - 1}{\gamma} \right)}} & (4.18)\end{matrix}$ with γ=min(P_(av), P_(p)).  (4.19)

Proof. If K=1, the entire codeword spans a single block of the fading channel and is affected by only a single channel fade. The situation is then the same as constant power allocation. The instantaneous capacity is maximized by allocating the maximum allowable power to the codeword, which is γ=min(P_(av), P_(p)).  (4.20) Then, by the procedure of Theorem 4.2.1, the optimal transmission rate is R*=W(γ) and (4.18) is obtained. □

Finally, in the case of a long-term average power constraint it is possible to find sufficient conditions that the optimal transmission rate R^(lt) and optimal power cutoff s*_(R^(lt)) satisfy.

Theorem 4.2.3. If K=1 and the channel gains follow a χ₂ ² distribution, then $\begin{matrix}{e^{R^{lt}}E_{i}\left( 1,\frac{e^{R^{lt}} - 1}{s_{R^{lt}}^{*}} \right) = P_{av},} & (4.21) \\ {\left( s_{R^{lt}}^{*} \right)^{2} - P_{av}R^{lt}e^{R^{lt}}e^{\left( \frac{e^{R^{lt}} - 1}{s_{R^{lt}}^{*}} \right)} = 0} & (4.22)\end{matrix}$ where E_(i)(1, x)=∫₁^(∞) e^(−xt)/t dt, are sufficient conditions that R^(lt) and s*_(R^(lt)) satisfy.

Proof. Condition (4.21) is a sufficient condition for the optimal power cutoff s*_(R^(lt)). It is obtained by finding the optimal short-term cutoff for the optimal transmission rate R^(lt). That is, finding the s such that P₁(s)=P_(av).

Condition (4.22) is also a sufficient condition that the optimal transmission rate R^(lt) and power cutoff s*_(R^(lt)) satisfy. For transmission rate R and cutoff s_(R), P_(out)(R, P_(av), 1)=1−e^(−(e^(R)−1)/s_(R)) is used, which permits defining T(R)=Re^(−(e^(R)−1)/s_(R)). Taking the derivative d[T(R)]/dR and setting it to 0, it is seen that $\begin{matrix}{s_{R} + R\left( \frac{e^{R} - 1}{s_{R}} \right)\frac{ds_{R}}{dR} - R\,e^{R} = 0.} & (4.23)\end{matrix}$ By letting g(s, R)=P₁(s)−P_(av) and performing implicit differentiation, $\frac{ds_{R}}{dR} = - \frac{\partial g/\partial R}{\partial g/\partial s_{R}}$, it is determined that $\begin{matrix}{\frac{ds_{R}}{dR} = \frac{- e^{R}\left\lbrack E_{i}\left( 1,x \right) - x\,E_{i}\left( 0,x \right) \right\rbrack}{x^{2}E_{i}\left( 0,x \right)}} & (4.24)\end{matrix}$ where x=(e^(R)−1)/s_(R) and E_(i)(0, x)=∫₁^(∞) e^(−xt) dt. Substituting this back into (4.23) and setting s_(R)=s*_(R^(lt)), (4.22) is obtained. □

4.3 Examples and Discussion

MZT_(DT) quantifies the maximum throughput achievable with scheme DT. As is the case for constant power transmission, the benefit of allowing multiple (rather than a single) transmission attempts per codeword, with rate selection and power control, is an increased throughput for the same coding delay. In this section, this concept is illustrated for the χ₂ ² fading process.

4.3.1 Increased Throughput with the Multi-Attempt Approach

Within the single-attempt paradigm the need for a measure of zero-outage (error-free) communication performance for delay-limited systems led to the notion of delay-limited capacity, or ε-capacity with ε=0. When only a single transmission attempt is allowed, the transmission rate R must be supported on all possible α. Thus, delay-limited capacity quantifies the error-free data rate that can be supported over all α in the support of the fading process.

With CSI-RT, delay-limited capacity is always 0 for χ₂ ² fading when K=1. However, when K>1 non-zero delay-limited capacity is possible. FIG. 4A illustrates MZT_(DT)^(lt) and delay-limited capacity as a function of P_(av) for K=2, the smallest coding delay with non-zero delay-limited capacity. For the same coding delay, MZT_(DT)^(lt) is higher than delay-limited capacity for all P_(av). The performance benefit of the multi-attempt approach over the single-attempt approach is an increased throughput for the same coding delay.

4.3.2 Importance of Power Control

The conventional view about optimal power control is that it yields “a negligible [ergodic] capacity gain” over constant power transmission. This is quite evident when comparing C_(erg) and C_(erg-pc) as a function of P_(av) in FIG. 1C. However, in FIG. 4B MZT_(DT) is plotted as a function of P_(av) with K=1 for the constant, short-term (equivalent to constant power allocation for K=1) and long-term power allocation strategies. Comparing MZT_(DT)^(const) and MZT_(DT)^(lt), it is seen that the difference between the curves is large for all P_(av); power control is important for delay-limited systems. Therefore the original statement about optimal power control should be qualified: power control provides negligible performance gains for delay-unconstrained systems, but for delay-limited systems the gains can be significant.

The importance of power control is again shown in FIG. 4C, which plots MZT_(DT) with P_(av)=10 dB. By observing MZT_(DT)^(lt) as a function of K, it is seen that the throughput, under scheme DT, with optimal rate and power control converges very quickly to ergodic capacity. In fact, MZT_(DT)^(lt)=2.00 nats/sec/Hz when K=10, which is just slightly lower than C_(erg-pc)=2.07 nats/sec/Hz, achievable only when K=∞. Again, this illustrates that power control is more important than large coding delays for maximizing throughput. For example, a target throughput of 1 nat/sec/Hz is achieved with K=1 under the long-term average power constraint, but is not achievable even with K=100 for constant power transmission. It is also worth noting that the more relaxed the power constraint, the higher the throughput, i.e., MZT_(DT)^(const)≦MZT_(DT)^(st)≦MZT_(DT)^(lt). This relation holds for any coding delay K since the constant power allocation is a special case of the short-term power allocation, which in turn is a special case of the long-term power allocation strategy.

To reemphasize the importance of power control, FIG. 4B is examined. For K=1 and a throughput of 1 nat/sec/Hz, MZT_(DT)^(lt) is only about 0.5 dB away from ergodic capacity with power control, C_(erg-pc), for which K=∞. More surprisingly, for low SNR it is even greater than the ergodic capacity with constant power, C_(erg-const). In this SNR region, a better average throughput is achieved for K=1 with the delayed transmission scheme and power control, MZT_(DT)^(lt), than for K=∞ with constant power allocation and the single-attempt approach, C_(erg-const). This implies that optimal power control is more important than the number of fading states affecting each codeword (ergodicity).

4.3.3 Importance of Rate Selection

As is the case for constant power transmission, the simulation results for variable power transmission show that the transmission rate should be selected carefully in order to maximize throughput. Selecting a suboptimal transmission rate can result in a throughput much smaller than MZT_(DT). This can be seen in FIG. 4D, which plots the average throughput achieved with scheme DT as a function of transmission rate for the constant, short-term and long-term average power allocation strategies.

The peak of each curve corresponds to MZT_(DT). It is also seen that the larger the coding delay K, the larger the drop in throughput if the optimal transmission rate is overshot. Therefore care must be taken to solve (4.3) and select the appropriate transmission rate for the power allocation policy at hand.

FIG. 4E plots the optimal transmission rate, corresponding to MZT_(DT), as a function of coding delay K. The optimal transmission rate, especially for small K, can fluctuate a great deal. In fact a very non-intuitive phenomenon is observed: in some cases, the optimal transmission rate can actually be higher than ergodic capacity. For example, when K=1 the optimal transmission rate under the long-term average power constraint, R^(lt)=2.51 nats/sec/Hz, is more than 21% higher than the ergodic capacity of the channel, C_(erg-pc)=2.07 nats/sec/Hz. This is counter to common practice, where a transmission rate lower than capacity is normally used. This is not a violation of the ergodic capacity theorem, since the resulting throughput is virtually always less than ergodic capacity.

For a given power allocation policy, either the transmission rate or the outage probability, but not both, can be freely selected since they depend on one another. FIG. 4F plots the outage probability associated with the optimal transmission rate. It is seen that the optimal outage probability can be quite high, as was shown for constant power transmission. In fact, for P_(av)=10 dB the optimal outage probability when K=1 is 0.37 and 0.27 for the short-term and long-term average power constraints, respectively. This is interesting because it is counter to conventional practice; in most communication literature ε-capacity is normally measured for a small outage probability such as ε=0.01. However, it is seen that in order to maximize throughput the outage probability should be much higher.
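The gap between a fixed small outage target and the throughput-maximizing operating point can be illustrated for the simplest case. The sketch below assumes K=1, unit-mean χ₂ ² fading and all of P_(av) used in the single block, so that P_(out)(R)=1−e^(−(e^(R)−1)/P_(av)); the grid search and all function names are illustrative choices, not part of the described method.

```python
import math

def p_out(R, p_av):
    """Outage probability for K = 1 under the stated assumptions."""
    return 1.0 - math.exp(-(math.exp(R) - 1.0) / p_av)

def throughput(R, p_av):
    """Rate times success probability, as in (4.6)."""
    return R * (1.0 - p_out(R, p_av))

def eps_capacity(eps, p_av):
    """Largest rate with outage probability eps, from inverting p_out."""
    return math.log(1.0 - p_av * math.log(1.0 - eps))

def best_rate(p_av, r_max=8.0, steps=8000):
    """Grid search for the throughput-maximizing rate."""
    grid = [r_max * i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda R: throughput(R, p_av))

if __name__ == "__main__":
    p_av = 10.0  # 10 dB in linear units
    r_eps = eps_capacity(0.01, p_av)
    r_opt = best_rate(p_av)
    print("eps=0.01:", r_eps, throughput(r_eps, p_av))
    print("optimal :", r_opt, throughput(r_opt, p_av), p_out(r_opt, p_av))
```

Under these assumptions the throughput at the ε=0.01 operating point comes out roughly an order of magnitude below the maximum, which is reached at a much higher outage probability.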

4.3.4 Effect of a Peak Power Constraint

As discussed above, a peak power constraint can reduce the ability of a delay-limited communication system to prevent outage events, resulting in higher outage probabilities for the same transmission rate and average power. This will affect the MZT_(DT) of the system as well. Here the effect of the peak power constraint on the long-term average power scenario is illustrated. Similar results, though to a lesser degree, can be observed for the short-term power scenario.

A peak power constraint limits the maximum throughput. FIG. 4G shows MZT_(DT)^(lt) when K=5 as a function of P_(av) both with and without a peak power constraint. Also plotted for reference is ergodic capacity (without a peak power constraint). MZT_(DT)^(lt)(P_(av), K, P_(p)) is nearly identical to MZT_(DT)^(lt)(P_(av), K) when P_(av)<<P_(p). However, for larger P_(av) the outage probability, and therefore MZT_(DT)^(lt)(P_(av), K, P_(p)), becomes limited by the peak power constraint. Further increasing the average power does not increase MZT_(DT)^(lt)(P_(av), K, P_(p)), as the peak power constraint will not allow improvements in the minimum outage probability. FIG. 4G also illustrates MZT_(DT)^(lt) for a fixed PAR. Here, MZT_(DT)^(lt) continues to increase with P_(av), but the effect of the PAR restriction is obvious: MZT_(DT)^(lt) is less than that obtained without a peak power constraint. FIG. 4H illustrates the analogous results for the short-term power constraint. The same effects are present but are not as pronounced, because the short-term average power constraint is less affected by an additional peak power constraint than the long-term average power constraint.

MZT_(DT)^(lt)(P_(av), K, P_(p)) is plotted against coding delay K for P_(av)=10 dB and various values of P_(p) in FIG. 4I(b). The smaller the P_(p), the further MZT_(DT)^(lt)(P_(av), K, P_(p)) is from MZT_(DT)^(lt)(P_(av), K). An interesting phenomenon is observed as K increases; it is seen that the peak power constraint affects the maximum throughput, and hence the outage probability, to a lesser degree. This is explained by the fact that the likelihood of a substantially poor channel α decreases for large K. Hence, the likelihood of a power allocation vector that hits the peak power in several blocks also decreases, and the effect of the peak power constraint diminishes. The same phenomenon can be seen with the short-term power constraint in FIG. 4J, though to a lesser degree.

Properly selecting the transmission rate remains important when a peak power constraint is imposed. FIG. 4K plots the throughput against transmission rate with a long-term average power constraint for different values of P_(p). It is critical to select the transmission rate that corresponds to MZT_(DT)^(lt)(P_(av), K, P_(p)), since a suboptimal selection can yield a large throughput drop. The effect of the peak power constraint on the throughput is clearly seen: for small values of P_(p) and/or large values of R the transmitted signal is peak limited. That is, the throughput is less than that obtained if there is no peak power constraint. This same phenomenon is observed under the short-term power constraint in FIG. 4L, though not to the same degree as under the long-term power constraint.

FIG. 4M and FIG. 4O show the optimal transmission rate and the associated outage probability as a function of K for various P_(p) under the long-term average power constraint. The optimal transmission rate can be higher than ergodic capacity and the optimal outage probability can be high. Both observations run counter to conventional practice. As K grows, the difference in the optimal transmission rates, with and without a peak power constraint, decreases. FIG. 4N and FIG. 4P illustrate the analogous results under the short-term average power constraint.

5. Throughput Maximization with Queueing Delay Constraints

The throughput maximization analysis described previously measured communication performance under the multi-attempt paradigm. The average throughput for schemes RT, ID and DT is maximized. By allowing multiple transmission attempts per codeword, zero-outage communication is possible for finite coding delay K. This is often not the case for the single-attempt approach, which often results in zero throughput for finite K. The improved throughput achieved with the multi-attempt approach does not come without cost. The cost is a queueing delay, due to the random nature of the fading channel, that is not present with the single transmission attempt approach.

5.1 Mathematical Formulation

In the following queueing analysis, a slotted transmission system is assumed in which “time” is measured in multiples of the channel coherence time, or blocks of N transmitted symbols in the BF-AWGN channel model. A codeword transmission attempt is assumed to require 1 slot, which corresponds to K blocks if the coding delay is K. A simple Bernoulli arrival process in which either zero or one codeword arrives into the queue in any slot is also assumed. The arrival process has the distribution $\begin{matrix}{f_{a}(n) = \text{Prob}\left( n\ \text{arrivals} \right) = \left\{ \begin{matrix}{a,} & {n = 1} \\ {1 - a,} & {n = 0,}\end{matrix} \right.} & (5.1)\end{matrix}$ with average arrival rate E[n]=a, the average number of codewords arriving in any particular slot. The average service rate, 1/E[S], is the average number of codewords serviced by the server in any particular slot. Then the queue utilization factor $\begin{matrix}{\rho := \frac{\text{average arrival rate}}{\text{average service rate}} = a\,{\mathbb{E}}\lbrack S\rbrack} & (5.2)\end{matrix}$ represents the proportion of time that the server is busy. Factoring in the queue utilization yields $\begin{matrix}{T^{LT}\left( R,P_{av},K,a \right) = \rho\left( a,R \right)\frac{R}{{\mathbb{E}}\lbrack S\rbrack}} & (5.3)\end{matrix}$ as the long-term average throughput for a particular transmission rate R. The formulation is similar to (1.2) except for the scaling factor ρ(a, R) that accounts for the proportion of time the server in the queue is busy. For example, if the codeword arrival rate and transmission rate are such that the throughput is 2 nats/sec/Hz but ρ=½, implying that the server is busy only half of the time, then the long-term average throughput is 1 nat/sec/Hz.

The communications throughput without a constraint on the queueing delay was previously maximized. As such, the queue utilization factor was ρ=1, and the server was always busy either transmitting or retransmitting codewords. This implies that the average arrival rate of codewords into the queue is equal to the average service rate of the codewords. This approach limits the coding delay to K blocks and provides the maximum throughput for a particular retransmission scheme without a constraint on the queueing delay. In many applications, such as video or voice, excessive queueing delays cannot be tolerated. For these systems, operating at the maximum throughput T_(max)(P_(av), K) (1.2) is not feasible as it would lead to excessive delay. For such applications the arrival rate and coding rate may be adjusted to ensure that the queueing delay is not excessive.

The expected waiting-time, or delay, is the amount of time that a codeword spends in the system (either in the queue or under service). One way to constrain the queue length is to constrain the expected waiting-time of codewords that arrive into the system. Constant power transmission and scheme RT are used for illustration since this case is the most analytically tractable; however, similar results can be derived for other multi-attempt schemes both with and without power control. The problem of throughput maximization with a waiting-time constraint can be stated as $\begin{matrix}{T_{\max}^{D}\left( P_{av},K \right) = \sup\limits_{a,R}\left\{ \rho\left( a,R \right)\frac{R}{{\mathbb{E}}\lbrack S\rbrack}\ \text{:}\ {\mathbb{E}}\lbrack W\rbrack \leq D \right\}} & (5.4)\end{matrix}$ where E[W] is the expected waiting time for each codeword entering the system and the supremum is taken over all valid arrival rates, a, and transmission rates, R. (5.4) is examined in detail for scheme RT.

5.2 Optimal Throughput Maximization with Queueing Delay Constraints

Since the arrival process is Bernoulli, the interarrival time distribution, the distribution of the number of slots between consecutive codeword arrivals, is geometric with parameter a. From Section 2.1, it is known that the service time distribution for scheme RT is also geometric with parameter [1−P_(out)(R, P_(av), K)]. Since both the interarrival and service times are geometrically distributed, the communications system can be modeled as a discrete-time Geo/Geo/1 queue.

The expected waiting time for a Geo/Geo/1 queue, in terms of blocks in the BF-AWGN model, is
$\mathbb{E}[W] = \frac{K}{1 - \lambda}$  (5.5)
where
$\lambda := \frac{1 - \text{average service rate}}{1 - \text{average arrival rate}} = \frac{P_{out}(R,P_{av},K)}{1 - \alpha}.$  (5.6)

The queue utilization for scheme RT can be written as
$\rho = \frac{\alpha}{1 - P_{out}(R,P_{av},K)}.$  (5.7)
Without a queueing delay constraint, ρ=1 and the optimal transmission rate is R_(MZT_(RT))*. From (5.7), when ρ(α, R)=1 the resulting optimal codeword arrival rate is
α_(MZT_(RT))* = 1 − P_(out)(R_(MZT_(RT))*, P_(av), K).  (5.8)
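The quantities in (5.5)-(5.7) can be evaluated directly once an outage probability and an arrival rate are chosen. The following is a minimal sketch, not part of the described method: the function names and the operating point (K=1, P_out=0.2, α=0.5) are hypothetical and chosen only for illustration.

```python
import math


def outage_ratio(p_out: float, alpha: float) -> float:
    """lambda of (5.6): outage probability divided by (1 - arrival rate)."""
    return p_out / (1.0 - alpha)


def expected_wait_blocks(K: int, p_out: float, alpha: float) -> float:
    """E[W] of (5.5) in blocks; math.inf when the queue is unstable."""
    if alpha >= 1.0:
        return math.inf
    lam = outage_ratio(p_out, alpha)
    return math.inf if lam >= 1.0 else K / (1.0 - lam)


def utilization(alpha: float, p_out: float) -> float:
    """rho of (5.7) for scheme RT."""
    return alpha / (1.0 - p_out)


if __name__ == "__main__":
    K, p_out, alpha = 1, 0.2, 0.5  # hypothetical operating point
    print("E[W] in blocks:", expected_wait_blocks(K, p_out, alpha))
    print("utilization   :", utilization(alpha, p_out))
```

The waiting time becomes unbounded as α approaches 1−P_out, which is the ρ=1 operating point of (5.8).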

Using (5.4)-(5.6), the maximum zero-outage throughput for coding delay K and power P_(av) with a constraint D on the average waiting time can be written as
$MZT_{RT}^{D}(P_{av},K) = \sup_{\alpha,R}\left\{ \alpha R : \frac{K(1 - \alpha)}{1 - \alpha - P_{out}(R,P_{av},K)} \leq D \right\}.$  (5.9)
Throughput is maximized by optimally selecting the codeword arrival rate α and the coding rate R with the constraint that the expected waiting-time (delay) is no greater than D blocks. This problem can easily be converted to constrain the number of codewords in the communication system by applying Little's theorem
M = αD  (5.10)
with M representing the number of codewords in the system, either in the queue or being served. In general, the transmitter does not have control of the codeword arrival rate since data is generated by applications not under the control of the communication system. However, it is possible to optimize over the arrival rate α in order to determine the optimal rate at which applications should generate data. Though the objective function in (5.9) is convex, the set of feasible points for the optimization problem is not, and therefore (5.9) is not a convex optimization problem. Thus, a unique solution to (5.9) may not exist.

FIG. 5A plots the optimal MZT_(RT)^(D)(P_(av), K) as a function of the maximum average waiting time D, for K=1 and P_(av)=10 dB. MZT_(RT)^(D)(P_(av), K) was found by exhaustive search over the variables R and α. It is seen that the maximum throughput approaches MZT_(RT)(P_(av), K) as the constraint on the waiting time is relaxed, i.e., D→∞. Though it cannot be shown explicitly, since no closed form for (5.9) exists, the convergence of MZT_(RT)^(D)(P_(av), K) to MZT_(RT)(P_(av), K) appears monotonic. This figure is particularly useful as it allows the prediction of the best case performance of a communication system using retransmission scheme RT with both a finite coding delay K and a finite waiting-time D. It is also interesting to note that even for a small constraint, D≈10, the maximum throughput with a waiting-time constraint approaches that obtained without a waiting-time constraint.
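An exhaustive search of the kind described above can be sketched as a grid search over (α, R). This is an illustrative sketch only: it assumes K=1, the χ₂² outage expression that appears as (5.14), rates in nats, and P_(av) given as a linear value (10 dB is taken as 10.0). The grid sizes and the name `mzt_grid` are hypothetical.

```python
import math


def p_out(R, P_av):
    """Outage probability for K = 1 and chi-square(2) fading, per (5.14)."""
    return 1.0 - math.exp(-(math.exp(R) - 1.0) / P_av)


def mzt_grid(D, P_av, n_alpha=200, n_R=400, R_max=4.0):
    """Grid search for (5.9) with K = 1; returns (throughput, alpha, R)."""
    best = (0.0, 0.0, 0.0)
    for i in range(1, n_alpha):
        alpha = i / n_alpha
        for j in range(1, n_R + 1):
            R = R_max * j / n_R
            slack = 1.0 - alpha - p_out(R, P_av)
            # The waiting time grows with R, so once the constraint fails
            # every larger R on the grid fails as well.
            if slack <= 0.0 or (1.0 - alpha) / slack > D:
                break
            if alpha * R > best[0]:
                best = (alpha * R, alpha, R)
    return best


if __name__ == "__main__":
    for D in (2.0, 5.0, 10.0, 100.0):
        T, alpha, R = mzt_grid(D, 10.0)
        # Little's theorem (5.10): codewords in the system, M = alpha * D
        print(f"D={D:6.1f}  T={T:.4f}  alpha={alpha:.3f}  R={R:.3f}  M={alpha * D:.2f}")
```

Because the feasible set only grows as D increases, the grid value is nondecreasing in D, in line with the trend reported for FIG. 5A. The resolution of the result is limited by the grid spacing.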

For K=1 and P_(av)=10 dB, the optimal transmission rate R_(MZT_(RT)^(D))* and codeword arrival rate α_(MZT_(RT)^(D))* that maximize (5.9) are shown in FIGS. 5B and 5C, respectively. It is seen that R_(MZT_(RT)^(D))*→R_(MZT_(RT))* and α_(MZT_(RT)^(D))*→α_(MZT_(RT))* as D→∞. However, the convergence is not monotonic and the optimal values of R_(MZT_(RT)^(D))* and α_(MZT_(RT)^(D))* can fluctuate as a function of D. This is due to the non-convexity of the original problem (5.9). For small D, R_(MZT_(RT)^(D))* is quite far from the optimal R_(MZT_(RT))* while α_(MZT_(RT)^(D))* is not far from α_(MZT_(RT))*. Thus, in order to maximize the throughput while constraining the average waiting-time, the coding rate rather than the codeword arrival rate should be reduced; the frequency of codeword arrivals should be left unchanged while the amount of information in each codeword should be reduced. A reduction in the codeword arrival rate reduces the throughput to a greater extent than a reduction in the coding rate. This is non-intuitive, as conventional flow-control algorithms, such as TCP, reduce the frequency of packet generation when large queues build in communication networks. The difference is reconciled by the fact that the underlying cause for the buildup of queues is different. The action that TCP takes is motivated by the assumption that queues build due to congestion in the network, that is, that packets are being generated faster than the network can handle them. However, queues in fading channels grow due to the frequency of codeword generation and the fact that the medium itself is unreliable. For example, if the channel condition remains poor for 10 consecutive slots (resulting in outages for 10 consecutive transmission attempts) and zero new codewords arrive into the queue, then the queue size remains unchanged. However, if the link is assumed reliable and if zero codewords arrive into the queue for 10 consecutive slots, then the queue size shrinks by 10.
This concept allows for a novel method for waiting-time/delay (or queue-length) management in fading channels: if the average waiting-time is large, then it can be reduced by using a smaller coding rate (codewords with a smaller amount of data) at the transmitter. Conversely, a larger coding rate (more information per codeword) can be used at the transmitter to increase communications throughput at the expense of a larger waiting-time.

For the optimal coding rate and codeword arrival rate, the corresponding queue utilization ρ(α, R) is plotted in FIG. 5D as a function of D. It is seen that in order to satisfy a smaller waiting-time constraint D, the queue utilization is lowered until the delay constraint is met. From FIGS. 5B and 5C it is seen that this is accomplished by reducing the transmission rate rather than the arrival rate. This is preferable to the opposite situation (reducing the arrival rate while keeping the transmission rate constant) since it yields a larger throughput for the same waiting-time.
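The comparison between the two ways of meeting a waiting-time constraint can be checked numerically. The sketch below is an illustration under stated assumptions, not the procedure used to produce the figures: it assumes K=1, the χ₂² outage expression that appears as (5.14), rates in nats, and a linear P_(av). It also uses the observation (an added derivation, not from the text) that the unconstrained rate maximizing R·(1−P_out) satisfies R·e^R = P_(av). All function names are hypothetical.

```python
import math


def p_out(R, P_av):
    """Outage probability for K = 1 and chi-square(2) fading, per (5.14)."""
    return 1.0 - math.exp(-(math.exp(R) - 1.0) / P_av)


def unconstrained_optimum(P_av):
    """Return (R*, alpha*): R* solves R * exp(R) = P_av, found by bisection."""
    lo, hi = 0.0, math.log(1.0 + P_av)  # hi * exp(hi) >= P_av for P_av > 0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < P_av:
            lo = mid
        else:
            hi = mid
    return lo, 1.0 - p_out(lo, P_av)


def rate_cut(D, P_av):
    """Keep alpha = alpha*, lower R until the expected wait equals D."""
    R_star, alpha_star = unconstrained_optimum(P_av)
    # (1 - a)/(1 - a - p) <= D  is equivalent to  p <= (1 - a)(1 - 1/D)
    budget = (1.0 - alpha_star) * (1.0 - 1.0 / D)
    lo, hi = 0.0, R_star
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if p_out(mid, P_av) <= budget:
            lo = mid
        else:
            hi = mid
    return alpha_star * lo


def arrival_cut(D, P_av):
    """Keep R = R*, lower alpha until the expected wait equals D (D > 1)."""
    R_star, alpha_star = unconstrained_optimum(P_av)
    alpha = (D * alpha_star - 1.0) / (D - 1.0)
    return max(alpha, 0.0) * R_star


if __name__ == "__main__":
    for D in (2.0, 5.0, 10.0):
        print(D, rate_cut(D, 10.0), arrival_cut(D, 10.0))
```

For the values of D shown, lowering the rate gives the larger throughput. The gap narrows as D grows, since both strategies approach the unconstrained optimum.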

5.3 Near-Optimal Throughput Maximization with Queuing Delay Constraints

Since (5.9) is not a convex optimization problem, it may have many local maxima. As such, numerical techniques to solve (5.9) may not converge to the globally optimal solution. In this situation, a near-optimal optimization problem that is amenable to a numerical solution is desirable.

The optimal arrival rate α_(MZT_(RT)^(D))* in (5.9) does not deviate greatly from α_(MZT_(RT))* as a function of D. This phenomenon can be seen in FIG. 5C. In contrast, it is seen from FIG. 5B that the optimal transmission rate drops significantly for small D. Clearly, adjusting R rather than α is more important for controlling the waiting-time D while maximizing the throughput. Therefore, it makes sense for the arrival rate to be fixed at α=α_(MZT_(RT))* and for the optimization to be performed only over the transmission rate R.

For a fixed arrival rate α=α_(MZT_(RT))*, the near-optimal nMZT_(RT)^(D) for coding delay K and power P_(av) with average waiting time D is defined as
$nMZT_{RT}^{D}(P_{av},K) = \sup_{R}\left\{ \alpha_{MZT_{RT}}^{*}R : \frac{K\left(1 - \alpha_{MZT_{RT}}^{*}\right)}{1 - \alpha_{MZT_{RT}}^{*} - P_{out}(R,P_{av},K)} \leq D \right\}.$  (5.11)
This is a convex optimization problem since both the objective function and the set of feasible points are convex. Therefore, a globally optimal solution to (5.11) exists. The existence of a near-optimal convex optimization problem is also useful since both (5.9) and (5.11) must be solved numerically; if an optimization algorithm converges to a local maximum in both cases, then it is the globally optimal solution to (5.11) while it may not be for (5.9).

When K=1 and the channel fading is χ₂², a closed form solution to the near-optimal problem (5.11) can be found.

Theorem 5.3.1. If K=1 and the channel fading process follows a χ₂² fading distribution, then
$nMZT_{RT}^{D}(P_{av},1) = \alpha_{MZT_{RT}}^{*}\log\left(1 - P_{av}\log\left(\frac{1 - \alpha_{MZT_{RT}}^{*}(1 - D)}{D}\right)\right).$  (5.12)

Proof. To begin,
$nMZT_{RT}^{D}(P_{av},1) = \sup_{R}\left\{ \alpha_{MZT_{RT}}^{*}R : \frac{1 - \alpha_{MZT_{RT}}^{*}}{1 - \alpha_{MZT_{RT}}^{*} - P_{out}(R,P_{av},1)} \leq D \right\}.$  (5.13)
For K=1 and χ₂² fading,
$P_{out}(R,P_{av},1) = 1 - e^{-(e^{R} - 1)/P_{av}}.$  (5.14)
Substituting this into the waiting-time constraint gives
$\frac{1 - \alpha_{MZT_{RT}}^{*}}{e^{-(e^{R} - 1)/P_{av}} - \alpha_{MZT_{RT}}^{*}} \leq D$  (5.15)
which after some algebraic manipulation yields
$R \leq \log\left(1 - P_{av}\log\left(\frac{1 - \alpha_{MZT_{RT}}^{*}(1 - D)}{D}\right)\right).$  (5.16)
Clearly the linear objective function in (5.13) is maximized by satisfying (5.16) with equality, resulting in (5.12). □
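The closed form of Theorem 5.3.1 can be sanity-checked by substituting the resulting rate back into the waiting-time constraint. The sketch below is illustrative only: it assumes rates in nats and a linear P_(av), and it obtains α_(MZT_(RT))* from the stationarity condition R·e^R = P_(av) of R·(1−P_out), which is an added derivation rather than a statement from the text. Function names are hypothetical.

```python
import math


def p_out(R, P_av):
    """Outage probability (5.14): K = 1, chi-square(2) fading."""
    return 1.0 - math.exp(-(math.exp(R) - 1.0) / P_av)


def optimal_rate_unconstrained(P_av):
    """R* maximizing R * (1 - p_out): solves R * exp(R) = P_av by bisection."""
    lo, hi = 0.0, math.log(1.0 + P_av)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < P_av:
            lo = mid
        else:
            hi = mid
    return lo


def near_optimal_rate(D, P_av, alpha_star):
    """Right-hand side of (5.16), attained with equality at the optimum."""
    inner = (1.0 - alpha_star * (1.0 - D)) / D
    return math.log(1.0 - P_av * math.log(inner))


def waiting_time(R, P_av, alpha):
    """Left-hand side of the constraint in (5.13), in blocks (K = 1)."""
    return (1.0 - alpha) / (1.0 - alpha - p_out(R, P_av))


if __name__ == "__main__":
    P_av = 10.0  # 10 dB expressed as a linear value
    R_star = optimal_rate_unconstrained(P_av)
    alpha_star = 1.0 - p_out(R_star, P_av)
    for D in (2.0, 5.0, 20.0):
        R_n = near_optimal_rate(D, P_av, alpha_star)
        print(D, R_n, alpha_star * R_n, waiting_time(R_n, P_av, alpha_star))
```

The last printed column should reproduce D, confirming that the rate from (5.16) meets the constraint with equality.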

In FIG. 5A it is seen that the near-optimal nMZT_(RT)^(D)(P_(av), K), achieved by varying only the coding rate, performs nearly as well as MZT_(RT)^(D)(P_(av), K), achieved by optimizing both the transmission rate and the codeword arrival rate. A reduction in either α or R reduces both the expected waiting-time and the maximum throughput. However, the maximum throughput suffers a great deal more if α rather than R is reduced, explaining the fact that α_(MZT_(RT))* ≈ α_(MZT_(RT)^(D))*.

FIG. 5B compares the optimal R_(MZT_(RT)^(D))* and near-optimal R_(nMZT_(RT)^(D))* transmission rates as a function of the waiting-time D. As with R_(MZT_(RT)^(D))*, R_(nMZT_(RT)^(D))* converges to R_(MZT_(RT))* as D→∞. For K=1 this can be seen analytically from the throughput-maximizing transmission rate:
$R_{nMZT_{RT}^{D}}^{*} = \log\left(1 - P_{av}\log\left(\frac{1 - \alpha_{MZT_{RT}}^{*}(1 - D)}{D}\right)\right) \overset{D\rightarrow\infty}{=} \log\left(1 - P_{av}\log\left(\alpha_{MZT_{RT}}^{*}\right)\right) = R_{MZT_{RT}}^{*}.$  (5.17)

The near-optimal queue utilization is compared with the optimal queue utilization in FIG. 5D for K=1 and P_(av)=10 dB. Both are similar, although the near-optimal one is obtained by varying only the transmission rate and not the codeword arrival rate.

5.4 The Queueing Delay vs. Coding Delay Tradeoff

Above, the waiting-time is constrained to be less than D for a fixed coding delay of K. However, some applications using a communication system are affected by the total delay, and it does not matter whether the delay is spent in coding or queueing. For small K, retransmissions are less costly in terms of delay, but the instantaneous capacity, the amount of information that can be reliably transmitted with each codeword, is small. For large K the opposite is true: the instantaneous capacity is larger but retransmission is more costly in terms of delay. By optimizing over the coding delay K, an optimal balance can be struck.

Using this idea, it is possible to define
$MZT_{RT}^{D}(P_{av}) = \sup_{K}\left\{ MZT_{RT}^{D}(P_{av},K) \right\}$  (5.18)
and
$nMZT_{RT}^{D}(P_{av}) = \sup_{K}\left\{ nMZT_{RT}^{D}(P_{av},K) \right\}$  (5.19)
as the highest optimal and near-optimal throughput for P_(av) and average waiting-time D. These quantities are obtained by solving (5.9) and (5.11) for each value of K ∈ {1, 2, . . . , D} and then taking the supremum, over K, of these values.

The tradeoff between coding delay and queueing delay is illustrated in FIG. 5E, which plots MZT_(RT)^(D)(P_(av), K) and nMZT_(RT)^(D)(P_(av), K) as a function of K for D=20 and P_(av)=10 dB.

To the end-user the average waiting time is D=20 for each coding delay K; however, the throughput is not the same for each K. By optimizing over K, the throughput can be maximized without any effect on the average waiting-time of end users. In this case it is seen that there is a unique coding delay, K=16, that corresponds to both MZT_(RT)^(D)(P_(av)) and nMZT_(RT)^(D)(P_(av)). This indicates that for a total waiting-time of D=20 the coding delay should be set to K=16, with the codeword arrival and transmission rates found by solving (5.9) and (5.11), respectively. Also note that only zero throughput is achievable with K=20 for both the optimal and near-optimal techniques. With K=20 the minimum delay is already 20 blocks, so a retransmission of any codeword would violate the average waiting-time constraint. Since retransmissions are not permitted in this case and zero-outage communication is not possible with a single transmission attempt (delay-limited capacity is zero), the throughput is zero.
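The sweep over K described above can be sketched with a Monte Carlo estimate of the outage probability. This is a rough illustration under assumptions that are not taken from the text: the block gains are modeled as independent unit-mean exponential variables (χ₂² fading), an outage is taken to occur when the average of log(1 + gain·P_(av)) over the K blocks falls below R, rates are in nats, and the arrival rate is fixed at its unconstrained optimum for each K as in (5.11), with the factor K from (5.5) included in the waiting-time constraint. The function name, sample count, and seed are hypothetical, and the estimates carry sampling noise.

```python
import math
import random


def near_optimal_throughput(K, D, P_av, n=4000, seed=0):
    """Monte Carlo estimate of the near-optimal throughput for coding delay K."""
    if K >= D:
        # Any retransmission would exceed D blocks, and the zero-outage rate
        # is zero for this fading model, so no throughput is achievable.
        return 0.0
    rng = random.Random(seed + K)
    # Sorted samples of the per-codeword average mutual information.
    s = sorted(
        sum(math.log1p(rng.expovariate(1.0) * P_av) for _ in range(K)) / K
        for _ in range(n)
    )
    # Empirical outage at rate s[i] is i / n; pick the unconstrained optimum.
    i_star = max(range(n), key=lambda i: s[i] * (1.0 - i / n))
    alpha_star = 1.0 - i_star / n
    # K(1 - a)/(1 - a - p) <= D  is equivalent to  p <= (1 - a)(1 - K/D)
    budget = (1.0 - alpha_star) * (1.0 - K / D)
    m = int(budget * n)  # largest sample index whose empirical outage fits
    return alpha_star * s[m]


if __name__ == "__main__":
    D, P_av = 20, 10.0  # D in blocks, 10 dB as a linear value
    sweep = {K: near_optimal_throughput(K, D, P_av) for K in range(1, D + 1)}
    best_K = max(sweep, key=sweep.get)
    print("best K:", best_K, "throughput:", round(sweep[best_K], 4))
```

The coding delay that maximizes the estimate depends on the fading model and on sampling noise, so this sketch should not be expected to reproduce FIG. 5E exactly. It does return zero throughput at K=D, as discussed above.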

FIG. 5F shows a flow diagram of a technique used to optimize throughput during data transmission over a wireless channel. The technique may be implemented by first characterizing the channel (block 502). Characterizing the channel includes identifying a coherence time for the channel, identifying a noise power, and modeling a channel gain probability density function. Constraints on retransmissions, power, and delay are then determined (block 504) to establish the parameters within which subsequent optimization calculations are to be performed. Expected service time is subsequently formulated in terms of data rate, power, and coding delay (block 506). The expected service time is used in calculations to determine a data rate, power allocation, and coding delay that optimize throughput (block 508). Results of these calculations are used to transmit data at the optimal rate, coding delay, and power allocation (block 510), thus optimizing throughput.
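The flow of FIG. 5F can be sketched end to end for the simplest case. The following is a hedged illustration, not the described implementation: it assumes scheme RT with unlimited retransmissions, K=1, constant transmit power, an exponential (χ₂²) channel gain density, and rates in nats. `ChannelModel`, `configure`, and the returned dictionary keys are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class ChannelModel:
    """Block 502: hypothetical container for the channel characterization."""
    coherence_time_s: float  # coherence time; stored but unused in this K = 1 sketch
    noise_power: float       # receiver noise power
    mean_gain: float         # mean of the assumed exponential gain density


def expected_service_time(R, snr):
    """Block 506: E[S] = 1 / (1 - P_out) for scheme RT with K = 1."""
    success = math.exp(-(math.exp(R) - 1.0) / snr)  # 1 - P_out
    return math.inf if success == 0.0 else 1.0 / success


def configure(channel, P_av, R_max=8.0, steps=8000):
    """Blocks 504-510: choose the rate that maximizes T = R / E[S]."""
    snr = P_av * channel.mean_gain / channel.noise_power
    rates = [R_max * j / steps for j in range(1, steps + 1)]
    best_R = max(rates, key=lambda R: R / expected_service_time(R, snr))
    return {
        "rate_nats": best_R,
        "coding_delay_blocks": 1,
        "power": P_av,
        "throughput": best_R / expected_service_time(best_R, snr),
    }


if __name__ == "__main__":
    channel = ChannelModel(coherence_time_s=1e-3, noise_power=1.0, mean_gain=1.0)
    print(configure(channel, P_av=10.0))
```

A fuller version would also search over the coding delay and the power allocation in block 508; only the data rate is optimized here.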

1. A communication method comprising: characterizing a communications channel; determining a data rate that maximizes channel throughput; and configuring a transmitter to send a transmit signal with said data rate.
2. The method of claim 1, wherein said determining further includes determining a power allocation strategy that jointly maximizes the channel throughput with said data rate.
3. The method of claim 2, wherein said power allocation strategy provides for adjustment of the transmit power to compensate for channel gain variation.
4. The method of claim 3, wherein said power allocation strategy minimizes outage probability subject to peak power and average power constraints.
5. The method of claim 1, wherein the power allocation strategy sets γ_(k), a transmit gain for a kth interval, to
$\gamma_{k}(\underline{\alpha}) = \min\left(\max\left(\lambda(\underline{\alpha}) - \frac{1}{\alpha_{k}}, 0\right), P_{p}\right),$
wherein $\underline{\alpha}$ is a vector of the channel attenuation α_(k) for the last K intervals, P_(p) is a peak power constraint, and $\lambda(\underline{\alpha})$ is the solution to
$\frac{1}{K}\sum_{k=0}^{K-1}\min\left(\max\left(\lambda(\underline{\alpha}) - \frac{1}{\alpha_{k}}, 0\right), P_{p}\right) = P_{av}.$
6. The method of claim 1, wherein the power allocation strategy stochastically sets γ_(k), a transmit gain for a kth interval, to
$\gamma_{k}(\underline{\alpha}) = 0 \quad\text{or}\quad \gamma_{k}(\underline{\alpha}) = \min\left(\max\left(\lambda(\underline{\alpha}) - \frac{1}{\alpha_{k}}, 0\right), P_{p}\right),$
wherein $\underline{\alpha}$ is a vector of the channel attenuation α_(k) for the last K intervals, P_(p) is a peak power constraint, and $\lambda(\underline{\alpha})$ is the solution to
$\frac{1}{K}\sum_{k=0}^{K-1}\log\left[1 + \alpha_{k}\min\left(\max\left(\lambda(\underline{\alpha}) - \frac{1}{\alpha_{k}}, 0\right), P_{p}\right)\right] = R,$
wherein R is the data rate, and wherein the stochastic probability is based on a current channel gain and a history of transmit gains.
7. The method of claim 1, wherein the communications channel is a wireless channel.
8. The method of claim 7, wherein said characterizing includes: identifying a coherence time for the channel; identifying a noise power; and modeling a channel gain probability density function.
9. The method of claim 1, wherein said determining includes: maximizing a throughput function that accounts for an expected service time for transmitting a codeword to a receiver and an average amount of data carried by each codeword.
10. The method of claim 9, wherein the expected service time is expressible as a function of data rate.
11. The method of claim 9, wherein the expected service time is expressible as a function of coding delay.
12. The method of claim 9, wherein the expected service time accounts for a power allocation strategy.
13. The method of claim 9, wherein the expected service time accounts for delay constraints.
14. The method of claim 9, wherein the expected service time accounts for retransmission constraints.
15. The method of claim 9, wherein the expected service time accounts for outage probability.
16. The method of claim 9, wherein the expected service time accounts for receiver decoding strategy.
17. The method of claim 16, wherein the receiver decoding strategy includes: discarding incorrectly received codewords; and requesting re-transmission of the incorrectly received codewords.
18. The method of claim 16, wherein the receiver decoding strategy includes: requesting re-transmission of incorrectly decoded codewords; and combining re-transmitted codewords with incorrectly decoded codewords to decode the re-transmitted codewords.
19. The method of claim 9, wherein the throughput function is expressible as:
$T(R,\gamma,K) = \frac{R}{E[S(R,\gamma,K)]},$
wherein R is the data rate, γ is the transmit power, K is the coding delay, and E[S(R,γ,K)] is the expected service time.
20. The method of claim 19, wherein the expected service time is expressible as:
$E[S(R,P_{av},K)] = \frac{1}{1 - P_{out}(R,P_{av},K)},$
wherein P_(av) is the average transmit power, and P_(out)(R,P_(av),K) is the probability of a channel outage.
21. The method of claim 19, wherein the expected service time is expressible as:
$E[S(R,P_{av},K=1)] = \frac{e^{R} + P_{av} - 1}{P_{av}},$
wherein P_(av) is the average transmit power.
22. The method of claim 19, wherein the expected service time is expressible as:
$E[S(R,P_{av},K,L)] = \frac{1 - \left[P_{out}(R,P_{av},K)\right]^{L}}{1 - P_{out}(R,P_{av},K)},$
wherein P_(av) is the average transmit power, P_(out)(R,P_(av),K) is the probability of a channel outage, and L is the maximum number of transmission attempts per codeword.
23. A transceiver that comprises: a receiver configured to receive information characterizing a communications channel; and a transmitter configured to process said information to determine a data rate that maximizes a throughput for the communications channel, and further configured to provide a transmit signal to the communications channel using said data rate.
24. The transceiver of claim 23, wherein as part of determining a data rate that maximizes a throughput for the communications channel, the transmitter is configured to jointly determine a power allocation strategy that maximizes the throughput subject to a power constraint.
25. The transceiver of claim 24, wherein the power allocation strategy minimizes a channel outage probability.
26. The transceiver of claim 23, wherein the communications channel is a fading channel.
27. The transceiver of claim 26, wherein the information characterizing the channel includes a coherence time for the channel, a noise power, and a model for a channel gain probability density function.
28. The transceiver of claim 23, wherein as part of determining a data rate, the transceiver maximizes a channel throughput function that accounts for an expected service time for transmitting a codeword to a remote receiver.
29. The transceiver of claim 28, wherein the expected service time accounts for data rate and coding delay.
30. The transceiver of claim 29, wherein the expected service time further accounts for constraints on power, coding delay, and retransmission attempts.
31. The transceiver of claim 29, wherein the expected service time further accounts for outage probability and receiver decoding strategy.
32. The transceiver of claim 28, wherein the throughput function is expressible as:
$T(R,\gamma,K) = \frac{R}{E[S(R,\gamma,K)]},$
wherein R is the data rate, γ is the transmit power, K is the coding delay, and E[S(R,γ,K)] is the expected service time.
33. The transceiver of claim 32, wherein the expected service time is expressible as:
$E[S(R,P_{av},K,L)] = \frac{1 - \left[P_{out}(R,P_{av},K)\right]^{L}}{1 - P_{out}(R,P_{av},K)},$
wherein P_(av) is the average transmit power, P_(out)(R,P_(av),K) is the probability of a channel outage, and L is the maximum number of transmission attempts per codeword.
34. A wireless communications system that comprises: a remote transceiver configured to send information characterizing a communications channel; and a local transceiver configured to receive said information and to process said information to determine a data rate that maximizes a throughput for the communications channel, and further configured to transmit data to the remote transceiver using said data rate.